In brief: How do I code a contrast matrix for repeated contrasts (comparing adjacent levels) such that the intercept corresponds to the grand mean?
Example-Problem:
- A factor of 10 levels.
- Each contrast should compare one level with the next. (lv1 vs lv2, lv2 vs lv3, etc.)
- The intercept should be the grand mean.
Example-Solution:
Unfortunately, I have failed to transfer the Example-Solution (for a three-level factor) into a meaningful contrast.matrix for the Example-Problem, that is, a contrast.matrix that a linear model understands.
EDIT: I seem to have found a solution. However, I don't understand WHY it works. Hence, please advise/explain if you can:
First, the contrast scheme I'm interested in goes by confusingly different names. In this article by UCLA (https://stats.idre.ucla.edu/r/library/r-library-contrast-coding-systems-for-categorical-variables/#forward) it is referred to as "forward difference coding," which is very different from "difference coding," the name that article gives to reverse Helmert coding. In another article by UCLA (https://stats.idre.ucla.edu/spss/faq/coding-systems-for-categorical-variables-in-regression-analysis-2/#DUMMYCODING) the same method is referred to as "repeated effect coding."
Second, in the Example-Solution above (see picture; taken from "What is a contrast matrix?"), the corresponding matrix seems to be C, where it is again called "repeated contrasts." What I don't understand here: what is the purpose of the column of constant 1s, that is, c(1, 1, 1), which is what distinguishes it from UCLA's design?
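One possible reading of the column of 1s, sketched for three levels. This assumes the UCLA sign convention (level j minus level j+1); the symbols H, μ and β are notation introduced here for illustration and are not taken from either source. Write the wanted coefficients as weights on the cell means, β = Hμ, with the first row of H asking for the grand mean and the other rows for the adjacent differences. Since the model reproduces the cell means as μ = H⁻¹β, the coding matrix would be the inverse of H:

$$
H =
\begin{pmatrix}
1/3 & 1/3 & 1/3 \\
1 & -1 & 0 \\
0 & 1 & -1
\end{pmatrix},
\qquad
H^{-1} =
\begin{pmatrix}
1 & 2/3 & 1/3 \\
1 & -1/3 & 1/3 \\
1 & -1/3 & -2/3
\end{pmatrix}
$$

Under this reading, the first column of H⁻¹ is simply the intercept column, and the remaining two columns follow the same pattern as UCLA's forward difference table. A model function that adds its own intercept would then need only those remaining columns; supplying the 1s as well would duplicate the intercept.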
Now, transferring the UCLA design to my example of a 10-level factor, this seems to be the desired solution:
Two main problems/questions remain:
A design like the Example-Solution from Cross Validated (namely, one that adds a column of constant 1s) yields a model error of "deficient matrix rank." That is, I don't understand why such a constant column (c(1, 1, ..., 1)) is proposed, why it apparently does not work for my model, and why UCLA does not describe it.
I also don't understand the above contrast matrix on its own terms. To me, it looks (in theory) as if we compare, column-wise, lv1 to all other levels, the average of lv1 and lv2 to all other levels, and so on.
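Both questions can be probed numerically. The following is a minimal base-R sketch, not a definitive implementation: the names H, B and contrast_matrix, the level labels and the simulated data are all made up for illustration. It assumes that each coefficient is specified as a set of weights on the ten cell means (a "hypothesis matrix") and that the coding matrix is obtained by inverting it.

```r
# Sketch: repeated (adjacent-level) contrasts with a grand-mean intercept.
k <- 10

# Hypothesis matrix H: each row states what one coefficient should estimate,
# written as weights on the k cell means.
H <- matrix(0, nrow = k, ncol = k)
H[1, ] <- 1 / k                      # row 1: grand mean of the cell means
for (j in seq_len(k - 1)) {
  H[j + 1, j]     <-  1              # rows 2..k: level j minus level j + 1
  H[j + 1, j + 1] <- -1
}

# Inverting H gives the full coding matrix; its first column is the intercept.
B <- solve(H)
ones_column <- B[, 1]

# lm() adds the intercept itself, so only the other k - 1 columns are passed on.
contrast_matrix <- B[, -1]

# Simulated, balanced data (purely illustrative).
set.seed(1)
lv <- paste0("lv", seq_len(k))
f  <- factor(rep(lv, each = 5), levels = lv)
y  <- rnorm(length(f), mean = as.integer(f))
contrasts(f) <- contrast_matrix

fit <- lm(y ~ f)
cell_means <- as.numeric(tapply(y, f, mean))

# The intercept should match the mean of the cell means, and the remaining
# coefficients the differences lv1 - lv2, lv2 - lv3, ...
stopifnot(isTRUE(all.equal(as.numeric(coef(fit)[1]), mean(cell_means))))
stopifnot(isTRUE(all.equal(as.numeric(coef(fit)[-1]), -diff(cell_means))))
```

If this reading is right, the columns of the coding matrix are not themselves the comparison weights, which would explain why they look like "lv1 vs. all others" when read column-wise. The comparisons sit in the rows of the inverse (H here), while the coding matrix only says how the coefficients add up to each level's mean.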