If I have a categorical variable with three levels (CatXVar) that I recode into two dummy variables (NYXVar and BostonXVar) such that:

YVar ContXVar  CatXVar NYXVar BostonXVar
0.23 10        NY      1      0
0.1  22.3      Boston  0      1
0.52 11.9      London  0      0

and I want to see whether CatXVar affects the significance of any relationship between YVar and ContXVar, should I run two separate regressions of:

Yvar ~ ContXVar + NYXVar

and

Yvar ~ ContXVar + BostonXVar

or should I run the regression as:

Yvar ~ ContXVar + NYXVar + BostonXVar
Kaleb
  • If you exclude the intercept from your model you can run it using CatXVar as the predictor. – Mike Hunter Nov 01 '15 at 19:32
  • I've read that I should convert k-level categorical variables into k-1 binary dummy variables though? – Kaleb Nov 02 '15 at 08:06
  • Not if you don't have an intercept. – Mike Hunter Nov 02 '15 at 09:52
  • How do I get rid of the intercept in R with glm? – Kaleb Nov 02 '15 at 12:50
  • You can remove the intercept by including `-1` in the model specification. You might want to consider reading [this](http://stats.stackexchange.com/questions/7948/when-is-it-ok-to-remove-the-intercept-in-lm) before doing that. `Yvar ~ ContXVar + NYXVar + BostonXVar` seems to be the safe bet here. – horseoftheyear Nov 04 '15 at 14:35
  • Yeah why not. Added it as answer. Might trigger some useful feedback/insights. – horseoftheyear Nov 04 '15 at 17:56

1 Answer

You can remove the intercept, as suggested in the comments, by including `-1` in the model specification. However, you might want to consider reading [this](http://stats.stackexchange.com/questions/7948/when-is-it-ok-to-remove-the-intercept-in-lm) before doing that.

`Yvar ~ ContXVar + NYXVar + BostonXVar` seems to be the safe bet here.
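A minimal sketch of that model in R, on simulated data (the sample size, seed, and effect sizes are made up for illustration and are not the asker's data). It uses `lm`; the same formulas should work with `glm`. It checks that the two hand-coded dummies give the same coefficients as passing CatXVar as a factor with London as the reference level, and that `-1` only reparameterises the city effects without changing the fitted values.

```r
set.seed(1)  # simulated data, purely illustrative
n <- 90
CatXVar  <- factor(rep(c("London", "NY", "Boston"), each = n / 3),
                   levels = c("London", "NY", "Boston"))  # London = reference level
ContXVar <- runif(n, 5, 25)
Yvar     <- 0.2 + 0.01 * ContXVar + 0.1 * (CatXVar == "NY") -
            0.05 * (CatXVar == "Boston") + rnorm(n, sd = 0.05)

# Hand-coded k - 1 = 2 dummies; London is the all-zero baseline
NYXVar     <- as.integer(CatXVar == "NY")
BostonXVar <- as.integer(CatXVar == "Boston")

fit_dummies <- lm(Yvar ~ ContXVar + NYXVar + BostonXVar)
fit_factor  <- lm(Yvar ~ ContXVar + CatXVar)      # R builds the dummies itself
fit_noint   <- lm(Yvar ~ ContXVar + CatXVar - 1)  # no intercept: one coefficient per city

# Same coefficients from dummies and factor; same fitted values with or without intercept
same_coefs <- all.equal(unname(coef(fit_dummies)), unname(coef(fit_factor)))
same_fit   <- all.equal(fitted(fit_factor), fitted(fit_noint))
```

With the intercept, the city coefficients are differences from London; without it, each city gets its own level and no category is dropped.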

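Running two separate regressions means estimating two different models, each of which leaves out one of the dummies: the NY-only model does not account for the Boston effect, and the Boston-only model does not account for the New York effect. In each case the omitted city is absorbed into the baseline together with London.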
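To see the difference, here is a rough sketch on simulated data (again made up for illustration, with a hypothetical helper variable `city`). It fits the NY-only model and the full model, and compares them with a nested-model F test via `anova`, which is one way, not the only way, to check whether the left-out dummy matters.

```r
set.seed(2)  # simulated data, purely illustrative
city     <- rep(c("London", "NY", "Boston"), each = 30)  # hypothetical helper variable
ContXVar <- runif(90, 5, 25)
Yvar     <- 0.2 + 0.01 * ContXVar + 0.1 * (city == "NY") -
            0.05 * (city == "Boston") + rnorm(90, sd = 0.05)
NYXVar     <- as.integer(city == "NY")
BostonXVar <- as.integer(city == "Boston")

fit_ny   <- lm(Yvar ~ ContXVar + NYXVar)               # baseline: London and Boston pooled
fit_full <- lm(Yvar ~ ContXVar + NYXVar + BostonXVar)  # baseline: London only

# Nested-model F test: does adding BostonXVar improve on the NY-only model?
cmp <- anova(fit_ny, fit_full)
print(cmp)

# ContXVar row (estimate, std. error, t value, p value) under each specification
print(coef(summary(fit_ny))["ContXVar", ])
print(coef(summary(fit_full))["ContXVar", ])
```

The NYXVar coefficient means something different in the two fits: NY versus everything else in `fit_ny`, NY versus London in `fit_full`.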

horseoftheyear