
I'm struggling to figure out how these adjusted $R^2$ values for linear regression were calculated with $n=8$ observations:

[screenshot from the book: table of candidate linear models with their $RSS$ and adjusted $R^2$ values]

Footnote 124 says that for a model with just an intercept, $RSS$ (residual sum of squares) equals $TSS$ (total sum of squares). So using $R^2=1-\frac{RSS}{TSS}$, we get $R^2=0$ for the model with just an intercept. Then I use the formula $$R^2_{adj} = 1-\left((1-R^2)\frac{n-1}{n-k-1}\right)$$ where $n$ is the number of observations (here $n=8$), and $k$ is the number of slopes (not including the intercept). So for the model with just the intercept, I get $$R^2_{adj} = 1-\left((1-0)\frac{8-1}{8-1}\right) = 0$$ whereas the book has $0.4077$.

I get a different answer for the other models as well. For instance, for the model only using $X_2$ I get $$R^2_{adj} = 1-\left(\frac{6981.58}{10693.5}\cdot \frac{8-1}{8-2}\right)=0.2383.$$

For the model with $X_1$ and $X_2$: $$R^2_{adj} = 1-\left(\frac{915.375}{10693.5}\cdot\frac{8-1}{8-3}\right) = 0.8802$$

For the model using all three predictors: $$R^2_{adj} = 1-\left(\frac{908.166}{10693.5}\cdot\frac{8-1}{8-4}\right) = 0.8514.$$
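To make the arithmetic explicit, here is the same calculation as a short Python snippet (the $RSS$ and $TSS$ values are the ones reported in the book's table):

```python
# Adjusted R^2 the way I am computing it:
#   R^2 = 1 - RSS/TSS,  R^2_adj = 1 - (1 - R^2) * (n - 1) / (n - k - 1)
# with n = 8 observations and k = number of slopes (excluding the intercept).
def adj_r2(rss, tss, n, k):
    r2 = 1 - rss / tss
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

n, tss = 8, 10693.5
print(adj_r2(10693.5, tss, n, 0))   # intercept only -> 0.0    (book: 0.4077)
print(adj_r2(6981.58, tss, n, 1))   # X2 only        -> 0.2383
print(adj_r2(915.375, tss, n, 2))   # X1 and X2      -> 0.8802
print(adj_r2(908.166, tss, n, 3))   # all three      -> 0.8514
```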

What am I missing or doing wrong?

kjetil b halvorsen
kccu
  • Are you sure that $r^2$ is the residual sos divided by the total? – mdewey Mar 06 '21 at 16:47
  • Could you please edit the question to include a citation of and link to the book you are quoting from? – EdM Mar 06 '21 at 17:02
  • @EdM The book is not freely available online. It is Howard Mahler's Guide to Statistical Learning: http://www.howardmahler.com/Teaching/MAS-1.html – kccu Mar 06 '21 at 17:57
  • @mdewey No, $R^2$ is the model sum of squares divided by the total sum of squares. Since $TSS=MSS+RSS$, $R^2 = \frac{MSS}{TSS}=1-\frac{RSS}{TSS}$. The formula for $R^2_{adj}$ uses $1-R^2$, which is equal to $\frac{RSS}{TSS}$. All these formulas are on the Wikipedia page for coefficient of determination, and in the textbook as well: https://en.wikipedia.org/wiki/Coefficient_of_determination – kccu Mar 06 '21 at 18:00
  • There seem to be two different models with X1 and X3 as predictors, with different coefficients. Seems a typo; the third one likely involved X2 and X3. – Marjolein Fokkema Mar 07 '21 at 02:18
  • @MarjoleinFokkema Yes I noticed that as well. I haven't checked the fitted regressions to the data, I just trusted that the RSS values were correct. – kccu Mar 07 '21 at 15:38

2 Answers


Not a definitive answer, but from what I gathered there are several different formulas for the adjusted R-squared. The adjusted R-squared tries to estimate the proportion of variance explained by the model at the population level, and since this is not an easy quantity to estimate, several competing versions have been proposed. Some of them are listed below, where $v$ and $k$ denote the number of predictors excluding the intercept (a quick numerical check on your example follows the list):

  • Wherry’s formula: $1-(1-R^2)\frac{(n-1)}{(n-v)}$
  • McNemar’s formula: $1-(1-R^2)\frac{(n-1)}{(n-v-1)}$
  • Lord’s formula: $1-(1-R^2)\frac{(n+v-1)}{(n-v-1)}$
  • Stein's formula: $1-\big[\frac{(n-1)}{(n-k-1)}\frac{(n-2)}{(n-k-2)}\frac{(n+1)}{n}\big](1-R^2)$
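
As a rough numerical sketch, here is how the four formulas compare when I plug in your $n=8$ and the $RSS$/$TSS$ values for your $X_1, X_2$ model (taking $v = k = 2$ predictors):

```python
# Comparing the four adjusted R^2 variants on the questioner's X1 + X2 model
# (RSS = 915.375, TSS = 10693.5, n = 8, v = k = 2 predictors).
n, v = 8, 2
r2 = 1 - 915.375 / 10693.5   # unadjusted R^2 ≈ 0.9144

wherry  = 1 - (1 - r2) * (n - 1) / (n - v)
mcnemar = 1 - (1 - r2) * (n - 1) / (n - v - 1)   # the formula used in the question
lord    = 1 - (1 - r2) * (n + v - 1) / (n - v - 1)
stein   = 1 - ((n - 1) / (n - v - 1)) * ((n - 2) / (n - v - 2)) * ((n + 1) / n) * (1 - r2)

print(wherry, mcnemar, lord, stein)   # ≈ 0.900, 0.880, 0.846, 0.798
```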

An often-cited study in this context is Yin and Fan (2001), which compares different adjusted R-squared versions on simulated data. See also these three questions about this issue:

What is the adjusted R-squared formula in lm in R and how should it be interpreted?

Would the real adjusted R-squared formula please step forward?

What is an unbiased estimate of population R-square?

For your specific example I did not get the values shown with any of the formulas listed above, but I suppose it is possible that the author used yet another formula. Perhaps footnote/reference 125 in your passage gives some indication of what was used?

Reference:

  • Yin, P., & Fan, X. (2001). Estimating $R^2$ shrinkage in multiple regression: A comparison of different analytical methods. The Journal of Experimental Education, 69(2), 203-224.
YR2018
    All of these formulas should still yield an adjusted $R^2$ of $0$ for the model with just an intercept though, correct? I don't understand how the text has a nonzero adjusted $R^2$ for that model. – kccu Mar 07 '21 at 15:39
  • Also the formula I used is the only one presented in the text the screenshot is from. – kccu Mar 07 '21 at 15:40
  • Wherry's formula above does not yield 0 for the model with just an intercept... but it doesn't produce the value in your passage either. Perhaps it's easiest to write the author and ask directly what software/formula he used for his calculations. – YR2018 Mar 08 '21 at 20:40
  • Ah, you are correct about Wherry's formula (and Stein's as well?). But the adjusted $R^2$ of $0.4077$ still seems entirely too large for a model with just an intercept. – kccu Mar 16 '21 at 18:06

The text is almost certainly in error. This is a danger with self-published works, as this appears to be. Without peer review and editors, mistakes easily creep in.

As the Wikipedia page says:

The adjusted $R^2$ can be negative, and its value will always be less than or equal to that of $R^2$.

There is no way that an intercept-only model can have an $R^2$ other than 0, so the text's claim of a substantial positive adjusted $R^2$ for that model (0.4077) must be an error.

We can also examine the implications of the claim that the 3-predictor model has an adjusted $R^2$ of 0.912. That is lower than the unadjusted $R^2$ of 0.915 so it can't be completely ruled out. But what would that mean for the relationship between $n$ and $p$? With those values and the adjusted $R^2$ formula, I get:

$$ \frac{n-1}{n-p-1}= 1.035$$

or $n \approx 1 + 30p$. That's not compatible with the assumed $n=8$, particularly not if $p=3$.
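
To spell out that arithmetic, here is a short sketch, plugging in the $R^2 = 0.915$ and adjusted $R^2 = 0.912$ quoted above and solving for the $n$ implied by each candidate $p$:

```python
# Back out the ratio (n - 1)/(n - p - 1) implied by R^2 = 0.915 and adjusted R^2 = 0.912,
# then solve (n - 1) = ratio * (n - p - 1) for n at each candidate p.
r2, adj = 0.915, 0.912
ratio = (1 - adj) / (1 - r2)                # ≈ 1.035
for p in (1, 2, 3):
    n = (1 - ratio * (p + 1)) / (1 - ratio)
    print(p, round(n, 1))                   # p = 3 would require n ≈ 89, not 8
```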

I suppose it's possible that there's some explanation for such discrepancies hidden elsewhere in the text, but I doubt it. Get in touch with the author to clarify.

EdM