We have a variable `pos` in the regression which takes three values: guard, forward, and center. My regression looks like `y ~ a + factor(pos)`. But one of the factor's levels (forward) is insignificant. How can I remove it from the regression in R without eliminating the others?
Note you probably don't want to (see e.g. [here](http://stats.stackexchange.com/questions/66448/should-covariates-that-are-not-statistically-significant-be-kept-in-when-creat/66454)). – Scortchi - Reinstate Monica Nov 22 '13 at 11:33
2 Answers
If you remove one of the factor's levels, you change the meaning of the reference category and thereby the meaning of the effects of the remaining levels. So that is almost always a very bad idea.
Say the reference category is guard, then your regression equation becomes:
$\hat{y} = a + b_1 \mathrm{forward} + b_2\mathrm{center}$
$b_2$ is the difference in expected value of $y$ between center and guard (center minus guard); likewise, $b_1$ is the difference between forward and guard.
If we remove forward from our model, the reference category becomes guard and forward combined. So now $b_2$ compares center against guard and forward together.
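A minimal sketch of this in R, on simulated data, may make the change in meaning concrete. Everything here is an assumption for illustration: the sample size, the made-up position effects in `shift`, and the object names `full`, `reduced`, and `merged`. The covariate `a` follows the question's formula `y ~ a + factor(pos)`.

```r
set.seed(1)

# Hypothetical data: 50 players per position, one numeric covariate `a`.
pos <- factor(rep(c("guard", "forward", "center"), each = 50),
              levels = c("guard", "forward", "center"))  # guard = reference
a <- rnorm(150)
shift <- c(guard = 0, forward = 0.2, center = 2)  # made-up position effects
y <- 1 + 0.5 * a + unname(shift[as.character(pos)]) + rnorm(150)

# Full model: both dummies, each compared against guard.
full <- lm(y ~ a + pos)

# "Removing" forward: only the center dummy stays, so guard and
# forward are pooled into a single reference group.
center <- as.numeric(pos == "center")
reduced <- lm(y ~ a + center)

# The same reduced model, written by merging the two factor levels.
pos_merged <- pos
levels(pos_merged)[levels(pos_merged) %in% c("guard", "forward")] <- "guard_or_forward"
merged <- lm(y ~ a + pos_merged)

coef(full)["poscenter"]   # center vs. guard
coef(reduced)["center"]   # center vs. guard and forward pooled
```

The two printed coefficients estimate different quantities, which is the point above: dropping the forward dummy does not leave the center effect untouched, it redefines what center is compared against.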
Even more generally, if you thought it was a good idea to include a variable in your model, then you should keep it in there even if it turned out not to be significant.

You could do this by changing the data set (dropping the rows for forwards), but you should not do it. You are evidently comparing some feature of basketball players. That could be useful (depending on what y is, of course). But if you remove the forwards, then you are creating a model for just guards and centers, and the estimated parameters will change accordingly.
If y is, say, height, then you should probably set the reference level to either "guard" or "center" (the shortest or the tallest). You will then be able to compare the other two positions to that one.
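Setting the reference level can be sketched with base R's `relevel()`. The data frame `players` and its height values are invented purely for illustration, and choosing "guard" as the reference is just one of the two options mentioned above.

```r
# Hypothetical heights (cm) for a handful of players; values are made up.
players <- data.frame(
  pos    = factor(c("guard", "guard", "forward", "forward", "center", "center")),
  height = c(188, 191, 201, 203, 211, 213)
)

levels(players$pos)  # default order is alphabetical, so "center" comes first

# Make "guard" the reference level without dropping any data.
players$pos <- relevel(players$pos, ref = "guard")
levels(players$pos)

fit <- lm(height ~ pos, data = players)
coef(fit)  # intercept = guard mean; other terms = differences from guard
```

All three positions stay in the model this way; only the baseline that the other two are compared against changes.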
