
I have 11 independent variables and one intercept. If I am right, that means there are $2^{12}=4096$ possible different models?

So does this mean that if I fit every model, save its $R^2_{\text{adj}}$, and take the model with the highest $R^2_{\text{adj}}$, it will be the best model (best fit, best coefficients, best std. errors)?

mdewey
  • The brief answer is no. The problem is that a higher adjusted R-squared does not always mean a better model. What are your criteria for best coefficients and best standard errors, by the way? – mpiktas Sep 27 '12 at 13:17
  • There are many more than $2^{12}$ models if you allow interactions. – Stéphane Laurent Sep 27 '12 at 13:26
  • And even more if you allow for nonlinear relationships (polynomials, splines, etc.) – Greg Snow Sep 27 '12 at 13:37
  • OK, let us say I do a normal linear regression analysis and use `regsubsets` in R. How many models does it evaluate in this case? @Greg Snow and @mpiktas: e.g. the criterion would be to have a good fit, so the adj. R-squared is OK as a criterion. Is the model returned by `regsubsets` then the best model? – Corvvax Sep 27 '12 at 13:40
  • There is no doubt such a procedure has the best fit *to the data*. That's fine as far as it goes, but as soon as you attempt to apply the results to anything else--to predict, to interpret, to estimate, to interpolate, to assess variability, to reason about *anything* other than the raw data--then it's highly likely your model does *not* have "best" coefficients (or even good ones) and it is almost certain its standard errors are too small. – whuber Sep 27 '12 at 14:28
  • I think a complication is the intercept term. It counts as a factor in the modeling only if you have the choice of setting it to 0 or not. For the models that set the y-intercept to 0, I think the interpretation of R-squared changes, so their R-squared and adjusted R-squared values may not be comparable to those of the other models. – Michael R. Chernick Sep 27 '12 at 17:28
  • Re your strategy for finding the best model in your second paragraph, you may want to read the answer I wrote here: [algorithms-for-automatic-model-selection](http://stats.stackexchange.com/questions/20836//20856#20856) to understand why that may not be a good idea. – gung - Reinstate Monica Sep 27 '12 at 22:25

1 Answer


Consider the following example in R:

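library(TeachingDemos)  # provides the evap data set
library(leaps)          # provides regsubsets()

# Best-subset search on the full data, keeping the 5 best models of each size
fit1 <- regsubsets(Evap ~ MaxST + MinST + AvST + MaxAT + MinAT + AvAT + MaxH + MinH + AvH + Wind,
    data = evap, nbest = 5)
sfit1 <- summary(fit1)
(w <- which.max(sfit1$adjr2))  # row of the model with the highest adjusted R^2
sfit1$which[w, ]               # variables included in that model

# Same search with the first observation left out
fit2 <- regsubsets(Evap ~ MaxST + MinST + AvST + MaxAT + MinAT + AvAT + MaxH + MinH + AvH + Wind,
    data = evap[-1, ], nbest = 5)
sfit2 <- summary(fit2)
(w2 <- which.max(sfit2$adjr2))
sfit2$which[w2, ]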

This carries out your strategy of finding the model with the highest adjusted $R^2$, then redoes the analysis leaving out the first data point. The two fits give different "best" models (they differ in whether to use maximum or minimum air temperature and whether to include wind). You could repeat this, leaving out each point in turn.

Would you really be comfortable calling a model the "best" model, knowing that a small change to the dataset (one fewer point collected) would have given a different "best" model?
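The leave-one-out check suggested above could be sketched roughly as follows. This is only an illustration under the same assumptions as the example (the `evap` data from TeachingDemos and the same `regsubsets` call); the names `winners`, `s`, and `keep` are made up for this sketch.

```r
library(TeachingDemos)  # provides the evap data set
library(leaps)          # provides regsubsets()

# For each observation i, drop it, rerun the best-subset search, and record
# which predictors appear in the model with the highest adjusted R^2.
winners <- character(nrow(evap))
for (i in seq_len(nrow(evap))) {
  s <- summary(regsubsets(
    Evap ~ MaxST + MinST + AvST + MaxAT + MinAT + AvAT + MaxH + MinH + AvH + Wind,
    data = evap[-i, ], nbest = 5))
  keep <- s$which[which.max(s$adjr2), ]  # logical vector, one entry per term
  winners[i] <- paste(names(keep)[keep & names(keep) != "(Intercept)"],
                      collapse = " + ")
}

# How often does each distinct "best" model win across the deletions?
sort(table(winners), decreasing = TRUE)
```

If the table shows several different winners, the label "best" depends on which single observation happened to be collected.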

Greg Snow
  • ok, and what about the number of possible combinations? Linear, without interactions (because I think regsubsets does not evaluate them right?)? Thanks @Greg Snow – Corvvax Sep 27 '12 at 14:30
  • The `regsubsets` function can give you every possible combination (my example always included the intercept, but you can change that), but it has an algorithm that lets it skip combinations that are very unlikely to be the "best". – Greg Snow Sep 27 '12 at 15:06
  • @GregSnow If you include the zero-intercept cases, I think you can't compare their R-squared or adjusted R-squared to those of the models with intercepts. So if you are really not sure whether the intercept is 0, you will need a different criterion for "best" to be able to compare all the models and find the minimum or maximum, whichever applies. – Michael R. Chernick Sep 27 '12 at 17:32
  • @MichaelChernick, Yes, thank you. I meant to mention that $R^2$, and probably the adjusted version too, is computed differently without the intercept and that the 2 methods are not comparable, but apparently forgot to include this important piece of information. – Greg Snow Sep 27 '12 at 18:50
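
For reference, the model counts discussed in the question and comments can be checked with a few lines of base R. This is just arithmetic under the question's setup of 11 candidate predictors; the variable name `p` is introduced here for illustration.

```r
p <- 11              # number of candidate predictors, as in the question

2^p                  # subsets when the intercept is always included
2^(p + 1)            # subsets when the intercept is optional as well
sum(choose(p, 0:p))  # same count as 2^p, summed over subset sizes
choose(p, 2)         # number of possible pairwise interaction terms
```

The second line reproduces the $2^{12}=4096$ from the question. The last line gives the number of extra candidate terms that pairwise interactions alone would add, which is why the count grows so quickly once interactions are allowed.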