Assume your outcome variable Y is continuous and the model includes only the race factor, so that:
Y = beta0 + beta1*race2 + beta2*race3 + ... + beta(n-1)*race-n + error
Then:
beta0 = mean value of Y when race is equal to race1
beta0 + beta1 = mean value of Y when race is equal to race2
beta0 + beta2 = mean value of Y when race is equal to race3
and so on.
If you are interested in the difference in the mean value of Y between race3 and race2,
that will be given by:
(beta0 + beta2) - (beta0 + beta1) = beta2 - beta1
So you can set your contrast as:
c = (0, -1, 1, 0, ..., 0)
where the length of the contrast vector is the same as the number of beta coefficients in the model.
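As a rough numerical check of this contrast, here is a minimal sketch in Python with NumPy. The toy data, the choice of four categories, and the 0..3 coding of race are assumptions made purely for illustration; they are not part of the original question.

```python
import numpy as np

# Hypothetical toy data: 4 race categories coded 0..3 (0 = race1, the reference).
rng = np.random.default_rng(0)
race = np.repeat(np.arange(4), 5)  # 5 observations per category
y = np.array([10.0, 12.0, 15.0, 11.0])[race] + rng.normal(0.0, 1.0, race.size)

# Design matrix: intercept column plus dummies for race2, race3, race4.
X = np.column_stack(
    [np.ones(race.size)] + [(race == k).astype(float) for k in (1, 2, 3)]
)

# Ordinary least squares estimates of (beta0, beta1, beta2, beta3).
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

# Contrast for race3 minus race2; one entry per beta coefficient.
c = np.array([0.0, -1.0, 1.0, 0.0])
estimate = c @ beta

# With only the race factor in the model, the contrast estimate should match
# the difference in sample means between the two groups.
diff_means = y[race == 2].mean() - y[race == 1].mean()
print(estimate, diff_means)
```

The two printed numbers should agree up to floating-point error, because a model with only the race factor reproduces each group's sample mean.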
Comment:
When a model includes dummy variables that encode the effect of a categorical variable, the model actually consists of a series of sub-models, one for each category of that variable. To write down each sub-model, first set all the dummy variables to zero (this gives the sub-model for the reference category), then set each dummy variable to 1 in turn while keeping all other dummy variables at zero.
For the model:
Y = beta0 + beta1*race2 + beta2*race3 + ... + beta(n-1)*race-n + error (*),
the race variable is categorical with n categories and the dummy variables race2, ..., race-n are used to encode its effect on Y. (The race1 dummy variable was omitted from the model, reflecting the fact that race1 is treated as a reference category.)
Here are the n sub-models that can be derived from model (*).
Sub-model 1 corresponds to race = race1 and is obtained by setting all dummy variables in model (*) to 0. Its equation is given by:
Y = beta0 + error
In this sub-model, beta0 represents the mean value of Y when race = race1.
Sub-model 2 corresponds to race = race2 and is obtained by setting the dummy variable for race2 to 1 in model (*) and all other dummy variables to 0. Its equation is given by:
Y = beta0 + beta1 + error
In this sub-model, beta0 + beta1 represents the mean value of Y when race = race2.
...
Sub-model n corresponds to race = race-n and is obtained by setting the dummy variable for race-n to 1 in model (*) and all other dummy variables to 0. Its equation is given by:
Y = beta0 + beta(n-1) + error
In this sub-model, beta0 + beta(n-1) represents the mean value of Y when race = race-n.
The above sub-models help elucidate the interpretation of the quantities beta0, beta0 + beta1, ..., beta0 + beta(n-1). Now we can construct differences between any of these quantities and interpret them. For example:
- (beta0 + beta1) - (beta0) = beta1 represents the difference in the mean value of Y between people for whom race = race2 and those for whom race = race1.
- (beta0 + beta2) - (beta0 + beta1) = beta2 - beta1 represents the difference in the mean value of Y between people for whom race = race3 and those for whom race = race2.
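One way to see where the contrast vector comes from is to write each sub-model as a row of 0s and 1s that multiplies (beta0, beta1, ..., beta(n-1)), and then subtract rows. The sketch below assumes n = 4 categories, and the helper name design_row is hypothetical, introduced only for this illustration.

```python
import numpy as np

n = 4  # hypothetical number of race categories


def design_row(k, n):
    """Return the row (1, race2, ..., race-n) for category k, with k counted from 1."""
    row = np.zeros(n)
    row[0] = 1.0  # the intercept is present in every sub-model
    if k > 1:
        row[k - 1] = 1.0  # dummy for race-k sits next to coefficient beta(k-1)
    return row


# One row per sub-model: row k dotted with the coefficients gives the mean of Y for race-k.
rows = {k: design_row(k, n) for k in range(1, n + 1)}

# Subtracting two rows gives the contrast vector for the corresponding difference in means.
c_2_vs_1 = rows[2] - rows[1]
c_3_vs_2 = rows[3] - rows[2]
print(c_2_vs_1)
print(c_3_vs_2)
```

Here c_2_vs_1 is (0, 1, 0, 0), which picks out beta1, and c_3_vs_2 is (0, -1, 1, 0), which picks out beta2 - beta1, matching the two differences listed above.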