I have a simple linear model (earnings ~ age), and I want to test in R whether the coefficient on age differs significantly between two subgroups in my data (those with a bachelor's degree and those without). The null hypothesis is 'coefficient(age_bachelor=0) = coefficient(age_bachelor=1)'.

I created the following two models:

    lm1 = lm(earnings ~ age, data=subset(mydata, bachelor==0))
    lm2 = lm(earnings ~ age, data=subset(mydata, bachelor==1))

How do I test if the coefficients of age between the two models are significantly different?

kjetil b halvorsen
  • It's usually better to fit a joint model with an interaction term; see [Separate Models vs Flags in the same model](https://stats.stackexchange.com/questions/373890/separate-models-vs-flags-in-the-same-model) – kjetil b halvorsen Oct 07 '21 at 18:58

1 Answer

I was going to answer in comments, but:

    lm(earnings ~ age*bachelor, data=mydata)

and look at the p-value for the interaction coefficient.
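To make "the interaction coefficient" concrete, here is a minimal sketch on simulated data. The data frame below is a hypothetical stand-in for `mydata` (the numbers are invented for illustration), and it assumes `bachelor` is stored as numeric 0/1, in which case the relevant row is named `age:bachelor`. If `bachelor` were a factor, the row would typically be named `age:bachelor1` instead.

```r
# Simulated stand-in for `mydata` (hypothetical values, for illustration only)
set.seed(1)
n <- 200
mydata <- data.frame(
  age      = runif(n, 25, 60),
  bachelor = rep(c(0, 1), each = n / 2)
)
# The simulated age slope is 500 without a degree and 900 with one
mydata$earnings <- 20000 + 500 * mydata$age +
  mydata$bachelor * (5000 + 400 * mydata$age) +
  rnorm(n, sd = 5000)

# age * bachelor expands to age + bachelor + age:bachelor
fit <- lm(earnings ~ age * bachelor, data = mydata)

# The "age:bachelor" row estimates the difference in age slopes between the
# two groups; its t-test is the test of the null hypothesis of equal slopes.
coef(summary(fit))["age:bachelor", ]
```

The estimate in that row equals the age slope from the `bachelor==1` subset model minus the age slope from the `bachelor==0` subset model, so the question's null hypothesis corresponds to this coefficient being zero.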

The interaction model is not identical to fitting separate models and comparing parameters (it assumes that the residual variance is the same in both groups), but it's sensible, and it is the standard approach.
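For comparison, one way to work directly from the two separate fits is a large-sample z comparison of the slopes, which lets each group keep its own residual variance. The sketch below is an illustration under stated assumptions, not part of the original answer: the data are simulated (a hypothetical stand-in for `mydata`, with deliberately unequal residual spread), the observations are assumed independent so the two slope estimates are independent, and the normal approximation is assumed adequate for the sample size.

```r
# Simulated stand-in for `mydata` (hypothetical values, for illustration only)
set.seed(1)
n <- 200
mydata <- data.frame(
  age      = runif(n, 25, 60),
  bachelor = rep(c(0, 1), each = n / 2)
)
# Residual spread is deliberately larger in the bachelor == 1 group
mydata$earnings <- 20000 + 500 * mydata$age +
  mydata$bachelor * (5000 + 400 * mydata$age) +
  rnorm(n, sd = ifelse(mydata$bachelor == 1, 9000, 3000))

lm1 <- lm(earnings ~ age, data = subset(mydata, bachelor == 0))
lm2 <- lm(earnings ~ age, data = subset(mydata, bachelor == 1))

b1 <- coef(summary(lm1))["age", ]
b2 <- coef(summary(lm2))["age", ]

# The subsets are disjoint, so the variance of the slope difference is taken
# as the sum of the two squared standard errors.
z <- unname((b2["Estimate"] - b1["Estimate"]) /
              sqrt(b1["Std. Error"]^2 + b2["Std. Error"]^2))
p_separate <- 2 * pnorm(-abs(z))

# Interaction model for comparison: same point estimate of the slope
# difference, but a standard error based on one pooled residual variance.
fit <- lm(earnings ~ age * bachelor, data = mydata)
p_pooled <- coef(summary(fit))["age:bachelor", "Pr(>|t|)"]

c(separate = p_separate, pooled = p_pooled)
```

Both approaches give the same estimated difference in slopes; they differ only in how the standard error is computed, so the p-values diverge mainly when the residual variances of the two groups are far apart.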

Ben Bolker
  • +1. https://stats.stackexchange.com/questions/17110 and https://stats.stackexchange.com/questions/13112 (*inter alia*) discuss the difference between including the interaction and running a separate regression for each group. – whuber Oct 07 '21 at 21:23