Expected value of $R^2$, the coefficient of determination, under the null hypothesis

Question

I am curious about the statement made at the bottom of the first page in this text regarding the adjusted $R^2$:

$$R^2_\mathrm{adjusted} =1-(1-R^2)\left({\frac{n-1}{n-m-1}}\right).$$

The text states:

"The logic of the adjustment is the following: in ordinary multiple regression, a random predictor explains on average a proportion $1/(n-1)$ of the response's variation, so that $m$ random predictors explain together, on average, $m/(n-1)$ of the response's variation; in other words, the expected value of $R^2$ is $\mathbb{E}(R^2) = m/(n-1)$. Applying the [$R^2_\mathrm{adjusted}$] formula to that value, where all predictors are random, gives $R^2_\mathrm{adjusted} = 0$."

This seems to be a very simple and interpretable motivation for $R^2_\mathrm{adjusted}$. However, I have not been able to work out that $\mathbb{E}(R^2)=1/(n-1)$ for a single random (i.e. uncorrelated) predictor.

Could someone point me in the right direction here?

In case the link goes dead in the future, could you provide a full reference? Thank you. — Richard Hardy, Apr 24 '18 at 08:35

score 10 · Accepted Answer · edited Apr 13 '17 at 12:44

This is accurate mathematical statistics. See this post for the derivation of the distribution of $R^2$ under the hypothesis that all regressors (bar the constant term) are uncorrelated with the dependent variable ("random predictors").

This is a Beta distribution, where $m$ is the number of predictors (not counting the constant term) and $n$ is the sample size:

$$R^2 \sim \mathrm{Beta}\left(\frac{m}{2}, \frac{n-m-1}{2}\right)$$

and so

$$\mathbb{E}(R^2) = \frac{m/2}{(m/2)+[(n-m-1)/2]} = \frac{m}{n-1}.$$

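This appears to be a clever way to "justify" the logic behind the adjusted $R^2$: if all regressors are indeed uncorrelated with the dependent variable, then the adjusted $R^2$ is zero "on average". Because the adjustment is linear in $R^2$, substituting $\mathbb{E}(R^2) = m/(n-1)$ gives

$$\mathbb{E}(R^2_\mathrm{adjusted}) = 1-\left(1-\frac{m}{n-1}\right)\left(\frac{n-1}{n-m-1}\right) = 1-\frac{n-m-1}{n-1}\cdot\frac{n-1}{n-m-1} = 0.$$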
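For anyone who wants to check this numerically, here is a minimal simulation sketch. It is not part of the referenced text: the function names `simulate_r2` and `adjusted_r2`, the choice $n=20$, $m=3$, and the use of independent standard normal data for both the response and the predictors are all my own assumptions, made only for illustration. It fits OLS with an intercept many times and compares the average $R^2$ with $m/(n-1)$.

```python
import numpy as np


def simulate_r2(n, m, reps, seed=0):
    """Return `reps` simulated R^2 values from OLS of y on m pure-noise predictors.

    Assumption for illustration: y and all predictors are independent
    standard normal draws, so no predictor is related to y.
    """
    rng = np.random.default_rng(seed)
    r2 = np.empty(reps)
    for i in range(reps):
        X = rng.standard_normal((n, m))
        y = rng.standard_normal(n)
        A = np.column_stack([np.ones(n), X])  # design matrix with intercept
        beta, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ beta
        ss_res = resid @ resid
        ss_tot = np.sum((y - y.mean()) ** 2)
        r2[i] = 1.0 - ss_res / ss_tot
    return r2


def adjusted_r2(r2, n, m):
    """Adjusted R^2 as in the formula quoted in the question."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - m - 1)


n, m = 20, 3  # hypothetical sample size and number of predictors
r2 = simulate_r2(n, m, reps=20000)

print("mean R^2          :", r2.mean())
print("m / (n - 1)       :", m / (n - 1))
print("mean adjusted R^2 :", adjusted_r2(r2, n, m).mean())
```

With these settings the first two printed numbers should agree closely and the third should be near zero, up to Monte Carlo error.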

answered Nov 13 '15 at 23:19

Alecos Papadopoulos


Just the bit of information I needed! Thank you! And long live Stack Exchange! – gregory_britten Nov 14 '15 at 23:06

I'd be interested in the case where not all regressors are uncorrelated with the dependent variable. Would you have any reference about this? – Olivier Feb 23 '18 at 17:03
@Olivier No I am afraid not. Look under "F-test for regression significance, distribution under the alternative", or something like that. – Alecos Papadopoulos Feb 23 '18 at 18:04
