I am quite new to R and am at the stage of fitting a regression model in it. The approach we have chosen is linear regression with dummy variables. As far as my knowledge and experience go, when using dummies one should choose a null interval: a class (group) within a variable that will receive 0 points. This should help tackle multicollinearity.

My question is: how do we set (if we can) the null interval? I reviewed the "dummies" package but did not see an option there.

Thanks, N

Bullzeye
  • Not sure what you mean by "null interval" here. You might find the thread [Understanding Dummy Variable Creation in GLM](http://stats.stackexchange.com/questions/94010/understanding-dummy-manual-or-automated-variable-creation-in-glm) useful? – goangit Dec 01 '14 at 09:12
  • The question is not about statistics but about R programming. Anyway you might check the "relevel" command in R. There you can choose the reference category. – Michael M Dec 01 '14 at 14:41

1 Answer

You seem to be asking "how do I choose which is the baseline group?" (the group that is subsumed into the intercept term). In R, if you have a factor, the default baseline is the first level in the factor.

You can easily change which level comes first with `relevel` (see `?relevel`, particularly the examples at the bottom), or you can completely reorder the factor, either via `reorder` or by calling `factor` on the factor with the newly ordered levels respecified.
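As a minimal sketch of the two approaches just described, assuming a made-up data frame `d` with a three-level factor `group` and a simulated response `y` (all names and values here are hypothetical, not from the question):

```r
# Hypothetical data: a three-level factor and a simulated numeric response.
set.seed(1)
d <- data.frame(
  group = factor(rep(c("a", "b", "c"), each = 10)),
  y     = rnorm(30)
)

levels(d$group)  # the default baseline is the first level, "a"

# Option 1: relevel() moves the chosen reference level to the front.
d$group_b <- relevel(d$group, ref = "b")

# Option 2: call factor() again with the levels given in the desired order.
d$group_c <- factor(d$group, levels = c("c", "a", "b"))

# The baseline is absorbed into the intercept; the other levels get dummies.
fit_a <- lm(y ~ group,   data = d)
fit_b <- lm(y ~ group_b, data = d)
fit_c <- lm(y ~ group_c, data = d)

names(coef(fit_b))  # intercept plus dummies for "a" and "c"; "b" is the baseline
```

Changing the baseline only reparameterises the model: the fitted values are the same in all three fits, while the intercept and the dummy coefficients are expressed relative to a different group.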

(You may also be able to do something useful with the `contrasts` function and some of the related functions, such as `contr.treatment`.)
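One possible way to do this, sketched under the assumption of a small made-up factor (the data frame `d2` and its columns are hypothetical), is to assign a treatment-contrast matrix whose `base` argument names the baseline level by position, leaving the level order itself untouched:

```r
# Hypothetical data: level order stays a, b, c throughout.
set.seed(2)
d2 <- data.frame(
  group = factor(rep(c("a", "b", "c"), each = 5)),
  y     = rnorm(15)
)

# Use the second level ("b") as the baseline without reordering the levels.
contrasts(d2$group) <- contr.treatment(levels(d2$group), base = 2)

contrasts(d2$group)  # the row for "b" is all zeros
fit2 <- lm(y ~ group, data = d2)
names(coef(fit2))    # dummies for "a" and "c" only
```

This keeps `levels(d2$group)` in its original order, which may matter for plots or tables, whereas `relevel` changes the order itself.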

With data sets that are not large, I find myself using `factor` to reorder more often lately.

Glen_b
  • Thanks for that answer! And yes, I meant the reference group! It may not be a popular term, but I often refer to it as the null group (because all the dummies are 0). Cheers! – Bullzeye Dec 02 '14 at 06:53
  • "null group" would be reasonably well understood. However, "null interval" is not; you wouldn't typically refer to a group as an interval. [Unless... hmm... are you splitting up a continuous variable? Generally speaking that's a bad idea.] – Glen_b Dec 02 '14 at 07:28