What is the baseline level in a factor-by-factor interaction?

Question

What is the baseline level for a factor-by-factor interaction term in multiple regression?

Consider this example from Fox 2003. In the regression below, these two variables are categorical: year={1997,..,2002} and colour={black,white}.

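library(effects)  # makes the Arrests data available
library(lmtest)   # provides coeftest()
# Treat year as a categorical predictor rather than a number
Arrests$year <- as.factor(Arrests$year)
arrests.mod <- glm(released ~ employed + citizen + checks
                         + colour*year + colour*age,
                         family=binomial, data=Arrests)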

Which yields:

> coeftest(arrests.mod)

z test of coefficients:

                       Estimate Std. Error  z value  Pr(>|z|)    
(Intercept)           0.3444334  0.3100749   1.1108 0.2666514    
employedYes           0.7350645  0.0847701   8.6713 < 2.2e-16 ***
citizenYes            0.5859841  0.1137717   5.1505 2.598e-07 ***
checks               -0.3666425  0.0260322 -14.0842 < 2.2e-16 ***
colourWhite           1.2125167  0.3497751   3.4666 0.0005272 ***
year1998             -0.4311794  0.2603589  -1.6561 0.0977023 .  
year1999             -0.0944343  0.2615447  -0.3611 0.7180519    
year2000             -0.0108975  0.2592073  -0.0420 0.9664655    
year2001              0.2430630  0.2630151   0.9241 0.3554129    
year2002              0.2129549  0.3532786   0.6028 0.5466444    
age                   0.0287279  0.0086191   3.3330 0.0008590 ***
colourWhite:year1998  0.6519565  0.3134898   2.0797 0.0375555 *  
colourWhite:year1999  0.1559504  0.3070430   0.5079 0.6115161    
colourWhite:year2000  0.2957537  0.3062034   0.9659 0.3341076    
colourWhite:year2001 -0.3805413  0.3040538  -1.2516 0.2107305    
colourWhite:year2002 -0.6173178  0.4192551  -1.4724 0.1409086    
colourWhite:age      -0.0373729  0.0102003  -3.6639 0.0002484 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

In the table above, I am interested in identifying the baseline level for the factor-by-factor interaction term. For instance, which other group is colourWhite:year1998 compared to?

Is colourWhite:year1997 the baseline level, or perhaps colourBlack:year1997?

See also: http://stats.stackexchange.com/questions/122246/interpretation-of-interaction-term?rq=1 — landroni, May 04 '15 at 09:55

score 2 · Accepted Answer · answered Apr 18 '15 at 17:02

The reference category is the combination of first levels of the factors in the model:

> with(Arrests, levels(colour))
[1] "Black" "White"
> with(Arrests, levels(year))
[1] "1997" "1998" "1999" "2000" "2001" "2002"

So the Intercept is for colourBlack:year1997 (with the other covariates at their reference levels or at zero), and the default treatment contrasts specify differences in means, here on the link scale, between this class and the other combinations of factors involved in your model specification. Hence colourWhite reflects the difference in $E(y)$ for the combination colourWhite:year1997. You can think of this really as

colourBlack:year1997 + colourWhite

as colourWhite represents the difference in 1997 for colour White.

The other interaction terms in the model are the additional differences for colour White in the other years, whilst the year main effects are the differences in $E(y)$ between the reference year and the other years for colour Black.
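To make the additive reading concrete, here is a small arithmetic sketch that uses only estimates copied from the coeftest() table in the question. The object names `b`, `shift` and `did` are illustrative and not part of the original code. The shifts are on the link (log-odds) scale, relative to the Black/1997 reference cell, and assume the other covariates are held fixed with age taken as 0 (because of the colour-by-age interaction, the White versus Black gap changes with age).

```r
# Estimates copied from the coeftest() output above (link scale = log-odds)
b <- c(
  "colourWhite"          =  1.2125167,
  "year1998"             = -0.4311794,
  "colourWhite:year1998" =  0.6519565
)

# Shift in the linear predictor relative to the reference cell (Black, 1997),
# holding employed, citizen and checks fixed and taking age = 0
shift <- c(
  black_1997 = 0,
  white_1997 = b[["colourWhite"]],
  black_1998 = b[["year1998"]],
  white_1998 = b[["colourWhite"]] + b[["year1998"]] + b[["colourWhite:year1998"]]
)

# The interaction coefficient is a difference of differences:
# (White 1998 - White 1997) - (Black 1998 - Black 1997)
did <- (shift[["white_1998"]] - shift[["white_1997"]]) -
       (shift[["black_1998"]] - shift[["black_1997"]])

print(round(shift, 4))
print(all.equal(did, b[["colourWhite:year1998"]]))
```

Read this way, colourWhite:year1998 is not the level of any one group. It is how much the 1997-to-1998 change for White differs from the same change for Black.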

Looking at the model matrix can often help in deciphering these things:

> head(model.matrix(~ colour * year + colour * age, data = Arrests))
  (Intercept) colourWhite year1998 year1999 year2000 year2001 year2002 age
1           1           1        0        0        0        0        1  21
2           1           0        0        1        0        0        0  17
3           1           1        0        0        1        0        0  24
4           1           0        0        0        1        0        0  46
5           1           0        0        1        0        0        0  27
6           1           0        1        0        0        0        0  16
  colourWhite:year1998 colourWhite:year1999 colourWhite:year2000
1                    0                    0                    0
2                    0                    0                    0
3                    0                    0                    1
4                    0                    0                    0
5                    0                    0                    0
6                    0                    0                    0
  colourWhite:year2001 colourWhite:year2002 colourWhite:age
1                    0                    1              21
2                    0                    0               0
3                    0                    0              24
4                    0                    0               0
5                    0                    0               0
6                    0                    0               0

Then look at the first few rows of the data to see how the dummy variables encode the groupings defined by the combinations of factors and their interactions.
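The same coding is easier to see on a tiny made-up design with one row per colour-by-year cell. This is a sketch with a hypothetical data frame `toy`, not the Arrests data, and it assumes R's default treatment contrasts.

```r
# One row per combination of two 2-level factors (hypothetical toy design)
toy <- expand.grid(
  colour = factor(c("Black", "White")),
  year   = factor(c("1997", "1998"))
)

# Dummy coding under the default treatment contrasts
mm <- model.matrix(~ colour * year, data = toy)

# Show each cell next to the indicator columns it switches on
print(cbind(toy, mm))
```

The Black/1997 row has a 1 only in the intercept column, while the White/1998 row has a 1 in every column, so its linear predictor is the sum of the intercept, colourWhite, year1998 and colourWhite:year1998.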

Gavin Simpson

Thank you so much for this exhaustive explanation! What about comparisons not directly displayed in the regression table? For instance, `colourWhite:year2000` vs `colourWhite:year2002`. Aside from the option of changing the reference level and re-estimating the regression, is it OK to use the `effects` package to graphically display the *effects* of all possible groups, and then draw conclusions from the graphs? This is very much related to this question: http://stats.stackexchange.com/questions/146715/how-to-interpret-factor-by-factor-interactions – landroni Apr 19 '15 at 00:12
*The other interaction terms in the model are the additional differences for colour White in the other years* By this do you mean that `colourWhite:year1998` is contrasted with `colourWhite` (which effectively represents `colourWhite:year1997`)? Thanks! – landroni Apr 19 '15 at 19:57
"What about comparisons not directly shown in the regression table?" That table just shows the parameters of the particular model parameterisation you have used. To get the comparison you want, you need *post hoc* comparisons. The **multcomp** package is particularly strong on this. **effects** is good too and whilst it generates the effects displays it does also provide the statistics behind the plots. – Gavin Simpson Apr 19 '15 at 21:43
For your second comment, yes; if you are `colourWhite` in 1997 then you take the intercept plus `colourWhite` coefs. If you are `colourWhite` in 1998 you want intercept, `colourWhite`, `year1998`, & `colourWhite:year1998` coefficients to get the mean for that combination. – Gavin Simpson Apr 19 '15 at 21:45
*"If you are `colourWhite` in 1998 you want `intercept`, `colourWhite`, `year1998`, & `colourWhite:year1998`"* Are you sure about including `year1998` for `colourWhite` in 1998? From your answer my understanding was that (1) for **`colourWhite` in 1998** we would take `intercept`, `colourWhite` & `colourWhite:year1998`, whereas (2) for **`colourBlack` in 1998** we would take `intercept` & `year1998`. Did I get that wrong? – landroni Apr 20 '15 at 09:43
Yes, `year1998` is the modification you need to make if you are in 1998; all samples in 1998 take this modification. The `colourWhite:year1998` is an *extra* modification that you take *only* if you are that specific combination. If you look at the model matrix (as shown in my Answer) the parametrisation is essentially a set of indicator variables. If you look closely at rows in the model matrix that correspond to rows in the data with `colour == "White" & year == "1998"` you'll see that all such entries have a `1` in the `year1998` column. – Gavin Simpson Apr 20 '15 at 16:00
Very interesting. So after the intercept, all remaining terms related to an interaction (both main-term and interaction-term regressors) are merely sequential additional differences. But then what do the statistical significances actually represent? For instance, the coefficient for `year1998` (`-0.4311794`) shows that Blacks in 1998 have a lower $E(y)$ than Blacks in 1997, correct? Whereas the coefficient for `colourWhite:year1998` (`0.6519565`) has no literal interpretation from what I see... Am I getting any closer? – landroni Apr 22 '15 at 12:55
Thank you for all the helpful explanations and pointers. `multcomp` does indeed do multiple comparisons (in some respects better than `effects`), as does `lsmeans`. All three packages overlap in scope, but differ somewhat in implementation... – landroni Apr 24 '15 at 13:36
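A comparison that is not in the coefficient table, such as White in one non-reference year against White in another year, is a linear combination of coefficients. Below is a minimal base-R sketch of such a contrast on simulated data. The data frame `d`, the outcome `y`, the seed and the three-year design are all made up for illustration. It is a hand-rolled Wald contrast with no multiplicity adjustment, and it is not meant to reproduce what `multcomp`, `effects` or `lsmeans` do internally.

```r
set.seed(1)

# Simulated stand-in data (hypothetical, not Arrests): two factors, binary outcome
d <- data.frame(
  colour = factor(sample(c("Black", "White"), 600, replace = TRUE)),
  year   = factor(sample(c("2000", "2001", "2002"), 600, replace = TRUE))
)
d$y <- rbinom(600, 1, 0.6)

fit <- glm(y ~ colour * year, family = binomial, data = d)

# Contrast: White/2002 minus White/2000 on the link (log-odds) scale.
# With 2000 as the reference year, the intercept and colourWhite cancel,
# leaving year2002 + colourWhite:year2002.
k <- setNames(numeric(length(coef(fit))), names(coef(fit)))
k[c("year2002", "colourWhite:year2002")] <- 1

est <- sum(k * coef(fit))                    # contrast estimate
se  <- sqrt(drop(t(k) %*% vcov(fit) %*% k))  # standard error from the coefficient covariance
z   <- est / se
p   <- 2 * pnorm(-abs(z))                    # two-sided Wald p-value

print(c(estimate = est, se = se, z = z, p = p))
```

The same recipe applies to the model in the question: write each cell as a sum of coefficients, subtract, and use the resulting vector of weights as `k`.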
