
I have randomized controlled trial data with one control and two treatments. Since the DV is continuous, I estimate the treatment effects using a linear model (lm), while controlling for a set of likely confounders, including a demographic feature D. I then run the usual diagnostics for linear models.
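For concreteness, the current model looks something like this, where dv, treat, x1, and x2 are placeholder names:

# Sketch of the current model; treat has levels "control", "T1", "T2",
# and x1, x2 stand in for the other covariates
fit <- lm(dv ~ treat + D + x1 + x2, data = dat)
summary(fit)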

From that, I’m getting significant estimates for both treatments, compared to the control. But I’m curious whether the treatment effect differs substantially across values of D. There are three groups here, so call them D1, D2, and D3.

My question is how to investigate such potential differences. I see three approaches, but am not sure which one—if any—is the correct one:

  1. I can look at the coefficient estimates for the levels of D in the lm model. For example, if D1 is the reference level, I can see whether being in D2 or D3 has a (significant) positive/negative effect on the DV, compared to D1, holding the other covariates fixed.
  2. I can identify ‘representative’ participants (defined with reference to either the sample or the population) from each of D1, D2, and D3, and then predict (and possibly plot) response values for them, to compare (sketched in R below).
  3. I can subset the data by D, fit three models (for D1, D2, and D3, respectively) without the D term, and then compare the treatment effects across the three models (also sketched below).
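
To make approaches 2 and 3 concrete, here is a rough sketch in R, reusing the placeholder names from above:

# Approach 2: predicted DV for a 'representative' participant in each group,
# here holding treatment and the other covariates at fixed values
rep_rows <- data.frame(treat = "T1",
                       D     = c("D1", "D2", "D3"),
                       x1    = mean(dat$x1),
                       x2    = mean(dat$x2))
predict(fit, newdata = rep_rows)

# Approach 3: separate models within each level of D, dropping the D term
fits <- lapply(split(dat, dat$D), function(s) lm(dv ~ treat + x1 + x2, data = s))
lapply(fits, coef)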

Are any of these reasonable ways to proceed?

kh_one

1 Answer


I think the typical approach is to fit the linear model with the two treatment dummies, the two demographic dummies, and the pairwise interactions between the treatment and demographic dummies. The coefficients on the interactions tell you how the effect of each treatment is modified by the demographics.

Combining everything into one model makes it much easier to test whether the effects are the same than the subset version (3) does. It is also easier to fit, since it is just one model.

The first approach just tells you the effect of the demographics. It says nothing about how treatment and demographics interact.

The second approach still requires you (or your audience) to do the comparisons, so it seems less direct than the saturated regression.

In R, you can do this as follows:

summary(lm(mpg ~ as.factor(vs) * as.factor(gear), data = mtcars))

Call:
lm(formula = mpg ~ as.factor(vs) * as.factor(gear), data = mtcars)

Residuals:
   Min     1Q Median     3Q    Max 
-7.440 -2.440  0.000  1.528  8.660 

Coefficients:
                                Estimate Std. Error t value Pr(>|t|)    
(Intercept)                       15.050      1.193  12.614 1.38e-12 ***
as.factor(vs)1                     5.283      2.668   1.980   0.0583 .  
as.factor(gear)4                   5.950      3.157   1.885   0.0707 .  
as.factor(gear)5                   4.075      2.386   1.708   0.0996 .  
as.factor(vs)1:as.factor(gear)4   -1.043      4.167  -0.250   0.8043    
as.factor(vs)1:as.factor(gear)5    5.992      5.336   1.123   0.2717    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 4.133 on 26 degrees of freedom
Multiple R-squared:  0.6056,    Adjusted R-squared:  0.5297 
F-statistic: 7.984 on 5 and 26 DF,  p-value: 0.0001142

Here, having a straight-line (SL) rather than a V-shaped engine improves MPG by 5.28 for the reference group of 3-gear cars. For 5-gear cars, the interaction adds 5.99 on top of that, so the SL effect there is 5.28 + 5.99 = 11.28. For 4-gear cars, the 5.28 is still there, but the interaction of -1.04 pulls the SL effect down to 4.24. In other words, the effect of an SL engine is largest for 5-gear cars and smallest for 4-gear cars, with the 3-gear reference group in between.
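
You can pull these group-specific effects straight from the coefficient vector; a small sketch:

# Effect of an SL engine within each gear group, from the fitted coefficients
b <- coef(lm(mpg ~ as.factor(vs) * as.factor(gear), data = mtcars))
b["as.factor(vs)1"]                                         # 3 gears: 5.28
b["as.factor(vs)1"] + b["as.factor(vs)1:as.factor(gear)4"]  # 4 gears: 4.24
b["as.factor(vs)1"] + b["as.factor(vs)1:as.factor(gear)5"]  # 5 gears: 11.28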

You can test the null hypothesis that the interaction coefficients are jointly zero with linearHypothesis() from the car package:

library(car)  # provides linearHypothesis()
fit <- lm(mpg ~ as.factor(vs) * as.factor(gear), data = mtcars)
linearHypothesis(fit, c("as.factor(vs)1:as.factor(gear)4=0", "as.factor(vs)1:as.factor(gear)5=0"))
Linear hypothesis test

Hypothesis:
as.factor(vs)1:as.factor(gear)4 = 0
as.factor(vs)1:as.factor(gear)5 = 0

Model 1: restricted model
Model 2: mpg ~ as.factor(vs) * as.factor(gear)

  Res.Df    RSS Df Sum of Sq      F Pr(>F)
1     28 472.77                           
2     26 444.15  2    28.617 0.8376 0.4441

Since the p-value is 0.44, you cannot reject the null that both interactions are zero, so the data are consistent with the treatment effect being the same in all three groups.
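
As an aside, if you also want the group-specific effects with standard errors in one step, the emmeans package (not used above, so treat this as an optional extra) can compute the within-group contrasts directly:

library(emmeans)  # optional convenience package
mt   <- transform(mtcars, vs = factor(vs), gear = factor(gear))
fit2 <- lm(mpg ~ vs * gear, data = mt)
emmeans(fit2, pairwise ~ vs | gear)  # cell means and the V-vs-straight contrast within each gear group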

dimitriy
  • Thanks for your response, Dimitriy. In terms of interpreting the coefficients on the interactions, I take it that if they do _not_ come out significant, I have no evidence of the treatment effect being substantially different across the relevant subgroups - is that correct? (And if they do come out significant, I do have such evidence.) – kh_one Jun 19 '20 at 09:58
  • Almost right, but "no evidence" is too strong. Above, neither interaction is significant, which means you can't distinguish the VS effect for 4-gear cars from the VS effect for 3-gear cars, and likewise for 5-gear versus 3-gear cars. – dimitriy Jun 19 '20 at 10:22
  • If you want to say something about all three effects together, I would do a joint test that the interaction coefficients are both zero. You also have more evidence that there is something happening in the 5-gear group, though it is still pretty weak. – dimitriy Jun 19 '20 at 10:24
  • This is really helpful. So, if we think of adding a Straight Line engine as the “treatment” here, the effect on mpg of adding a Straight Line engine for cars with 3 gears (the reference category) is 5.283. The effect of adding a Straight Line engine for cars with 4 (as opposed to 3) gears is 5.283 - 1.043 = 4.24, and for cars with 5 (as opposed to 3) gears it’s 5.283 + 5.992 = 11.275. Is that correct? And do these interpretations of the effect of an SL engine on mpg change if we add more covariates (cyl, disp, what have you)? – kh_one Jun 19 '20 at 11:22
  • That is perfect. The interpretation does not change if you add covariates, as long as you don’t also interact them with treatment. – dimitriy Jun 19 '20 at 11:25
  • Fantastic. Sorry, one final thing: the interpretation of treatment effects would be the same for a logistic regression in `glm`, the only difference being that effects would now be expressed in terms of differences in log odds, right? – kh_one Jun 19 '20 at 12:23
  • That is correct, but let me add two things. First, without other covariates, your fully interacted (aka saturated) OLS model will give [the exact same predictions and marginal effects as a logit](https://stats.stackexchange.com/questions/311334/how-do-i-perform-a-statistical-test-for-a-difference-in-differences-analysis/311368#311368). Adding covariates will cause them to diverge, but usually not by much; do use robust SEs with OLS and binary outcomes. Second, you can calculate marginal effects on the additive probability scale as well, [like this](https://stats.stackexchange.com/a/471851/7071). – dimitriy Jun 19 '20 at 15:45
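
To illustrate that last exchange, a small simulated sketch (not from the thread; all names and numbers are made up): with a saturated model and no extra covariates, OLS and logit recover the same fitted probabilities, namely the empirical cell proportions.

set.seed(1)
n <- 300
d <- data.frame(treat = factor(sample(c("C", "T1", "T2"), n, replace = TRUE)),
                D     = factor(sample(c("D1", "D2", "D3"), n, replace = TRUE)))
# True probabilities are invented and kept away from 0/1 so the logit converges
d$y <- rbinom(n, 1, 0.30 + 0.10 * (d$treat == "T1") + 0.05 * (d$D == "D2"))
ols   <- lm(y ~ treat * D, data = d)
logit <- glm(y ~ treat * D, data = d, family = binomial)
max(abs(fitted(ols) - fitted(logit)))  # ~0, up to glm's convergence tolerance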