Let's say I get an R-squared value of 97% with 3 independent variables. Now, let's say that my R-squared value increases to 99% after adding 3 more independent variables (so total of 6 independent variables). Is it fair to say that the additional 2% variance is explained by the newly added 3 independent variables (and the remaining 1% is noise)?

Any quick response is really appreciated.

Thanks. Abhishek M

MAbhishek
  • The short answer is no. If you add enough independent variables, you can “explain” anything. Perhaps look up something like “adjusted R-squared”. – Ed V Apr 28 '20 at 17:18
  • As Ed V has mentioned, R squared will never go down, and will usually keep creeping up, as you add more predictors [(see this)](https://stats.stackexchange.com/questions/207717/why-does-r2-grow-when-more-predictor-variables-are-added-to-a-model). Adjusted R squared penalizes the score as you add predictors [(see this)](https://stats.stackexchange.com/questions/52517/why-is-adjusted-r-squared-less-than-r-squared-if-adjusted-r-squared-predicts-the). – nwaldo Apr 28 '20 at 18:11
  • Thanks for the quick replies. I think I understand your point as it relates to the 1% noise I mentioned in my example. What about the 2%? Can I say that the extra 2% of variance is explained by the 3 new variables? – MAbhishek Apr 28 '20 at 18:24

1 Answer

As Ed V and nwaldo wrote, every variable you add, even the worst variable imaginable, will never decrease your in-sample R^2 and will almost always increase it. So although the "simple" answer is that the extra 2% was "added" by the new variables, concluding that they genuinely explain an additional 2% of the variance is statistically wrong, and so is the procedure.
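To see why, here is a short sketch, assuming ordinary least squares with an intercept and both models fitted to the same data. The symbols $SS_{res}$ (residual sum of squares) and $SS_{tot}$ (total sum of squares) are my notation, not the question's:

$$R^2 = 1 - \frac{SS_{res}}{SS_{tot}}, \qquad SS_{res}^{(6)} \le SS_{res}^{(3)} \;\Longrightarrow\; R^2_{(6)} \ge R^2_{(3)}$$

The inequality on the residuals holds because the 6-variable fit can always set the three new coefficients to zero and reproduce the 3-variable fit, so least squares can only do as well or better. $SS_{tot}$ depends only on the response, so it is the same in both models.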

Because R^2 can only go up, simply piling on more variables is not much of a science (try adding even a purely random variable and you'll see a small increase). Therefore, as mentioned in the comments, you should use adjusted R^2 or another measure that penalizes model size (AIC, for example) before concluding that the new variables really added explained variance.
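You can try the random-variable experiment yourself. The following is only a minimal sketch with simulated data: the sample size, coefficients, noise level, and the helper names `r_squared` and `adjusted_r_squared` are all made up for illustration, and it uses plain NumPy least squares rather than any particular statistics package.

```python
import numpy as np


def r_squared(X, y):
    """In-sample R^2 of an ordinary least squares fit with an intercept."""
    A = np.column_stack([np.ones(len(y)), X])      # prepend intercept column
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)   # least squares coefficients
    resid = y - A @ beta
    ss_res = resid @ resid                         # residual sum of squares
    ss_tot = ((y - y.mean()) ** 2).sum()           # total sum of squares
    return 1.0 - ss_res / ss_tot


def adjusted_r_squared(r2, n, p):
    """Adjusted R^2 for n observations and p predictors (intercept excluded)."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


# Simulated data (an assumption for illustration): 3 real predictors plus noise.
rng = np.random.default_rng(0)
n = 50
X_real = rng.normal(size=(n, 3))
y = X_real @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.5, size=n)

# Three extra predictors that are pure random noise, unrelated to y.
X_noise = rng.normal(size=(n, 3))
X_all = np.column_stack([X_real, X_noise])

r2_3 = r_squared(X_real, y)
r2_6 = r_squared(X_all, y)
adj_3 = adjusted_r_squared(r2_3, n, 3)
adj_6 = adjusted_r_squared(r2_6, n, 6)

print(f"3 predictors: R^2 = {r2_3:.4f}, adjusted R^2 = {adj_3:.4f}")
print(f"6 predictors: R^2 = {r2_6:.4f}, adjusted R^2 = {adj_6:.4f}")
```

The plain R^2 for the 6-predictor model cannot come out lower than for the 3-predictor one, even though the added columns are noise by construction. The adjusted version is discounted for the extra predictors, which is why it is the safer number to compare.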

HermanK