I have data from a randomized controlled trial with one control arm and two treatment arms. Since the DV is continuous, I estimate the treatment effects with an `lm` model, controlling for a set of likely confounds, including a demographic feature D, and then run the usual diagnostics for linear models.
Both treatments show significant effects relative to the control. What I'd like to know is whether the treatment effect differs substantially across the levels of D. D has three groups; call them D1, D2, and D3.
My question is how to investigate such potential differences. I see three approaches, but am not sure which one—if any—is the correct one:
- I can look at the coefficient estimates for the levels of D in the `lm` model. For example, if D1 is the reference level, I can see whether being in D2 or D3 has a (significant) positive/negative effect on the DV, compared to D1, holding fixed the other covariates.
- I can identify ‘representative’ participants (defined with reference to either the sample or the population) from each of D1, D2, and D3, and then `predict` (and possibly plot) response values for them, to compare.
- I can `subset` the data with reference to D, fit three models (for D1, D2, and D3, respectively) without the D coefficient, and then compare the treatment effects across the three models.
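
To make the three options concrete, here is a minimal sketch of what I have in mind, run on simulated data. Everything in it is a placeholder rather than my real setup: the data frame `dat`, the variables `y`, `treat`, `D`, and `age`, and the simulated effect sizes are all assumptions made up for illustration.

```r
# Simulated stand-in for the trial data; all names and effect sizes are hypothetical.
set.seed(1)
n <- 600
dat <- data.frame(
  treat = factor(sample(c("control", "T1", "T2"), n, replace = TRUE),
                 levels = c("control", "T1", "T2")),
  D     = factor(sample(c("D1", "D2", "D3"), n, replace = TRUE),
                 levels = c("D1", "D2", "D3")),
  age   = rnorm(n, mean = 40, sd = 10)   # stands in for the other covariates
)
dat$y <- 1 + 0.5 * (dat$treat == "T1") + 0.8 * (dat$treat == "T2") +
  0.3 * (dat$D == "D2") + 0.02 * dat$age + rnorm(n)

# Approach 1: one model with D as a covariate; inspect the D2 and D3 coefficients
# (D1 is the reference level).
fit_main <- lm(y ~ treat + D + age, data = dat)
coef(summary(fit_main))[c("DD2", "DD3"), ]

# Approach 2: predict the response for 'representative' participants,
# here one per treatment arm and D group, at the sample mean of age.
newdat <- expand.grid(treat = levels(dat$treat),
                      D     = levels(dat$D),
                      age   = mean(dat$age))
newdat$pred <- predict(fit_main, newdata = newdat)

# Approach 3: subset by D, fit one model per group without the D term,
# and collect the treatment coefficients side by side.
fits_by_D <- lapply(
  setNames(levels(dat$D), levels(dat$D)),
  function(d) lm(y ~ treat + age, data = subset(dat, D == d))
)
effects_by_D <- sapply(fits_by_D, function(f) coef(f)[c("treatT1", "treatT2")])
effects_by_D
```

Here `newdat` holds one predicted value per treatment-by-D combination, and `effects_by_D` is a matrix with one column per D group and one row per treatment coefficient.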
Are any of these reasonable ways to proceed?