
From Hayashi (2000, p. 20), I understand that computing $R^2$ can be problematic when the model does not include a constant (e.g. $R^2$ can be negative), but that these problems can be avoided by using the so-called uncentered $R^2$, $R^2_u = \hat{y}'\hat{y}/{y'y}$.

I have also read other discussions on the topic (e.g. removal-of-statistically-significant-intercept-term), but I still have some doubts:

Does the $R^2$ in a model without a constant become $R^2_u$, or do the two formulas stay different, so that $R^2_u$ is only a workaround for the case without a constant?

How do statistical packages (R, Stata) deal with this problem? That is, do they automatically report $R^2_u$ when the constant is not included, or do they report a wrong $R^2$?

PhDing
  • "Does the $R^2$ in a model without a constant become $R^2_u$?" Certainly not, as $R^2_u$ is always nonnegative but $R^2$ may not be in a model without constant. – Christoph Hanck Mar 24 '16 at 10:18
  • @ChristophHanck I was pretty sure about that, but I have some trouble distinguishing between the two formulas. Is the $R^2$ still $1-\frac{e'e}{y'M_{[1]}y}$, even without the constant in the model? – PhDing Mar 24 '16 at 10:29
  • As $R_u^2$ is far from universal notation, you need to define it. – Nick Cox Mar 24 '16 at 10:33
  • @NickCox Sorry, I thought I wrote it – PhDing Mar 24 '16 at 10:35
  • Thanks; I suggest that writing it as a fraction makes the font painfully small, especially for people with small monitors, so I've tweaked it. – Nick Cox Mar 24 '16 at 10:41
  • OK, I tried an answer for that point, but the last two about R and Stata may be off-topic. – Christoph Hanck Mar 24 '16 at 10:48

1 Answer


The definition of $R^2$, namely $R^2=1-e'e/(y'M_{1}y)$, stays the same even if there is no constant. If there is a constant, the lower bound of zero for $R^2$ can be obtained from
\begin{eqnarray}
y'M_{1}y&=&[M_{1}(Xb+e)]'[M_{1}(Xb+e)]\\
&=&b'X'M_{1}'M_{1}Xb+2e'M_{1}'M_{1}Xb+e'M_{1}'M_{1}e\\
&=&b'X'M_{1}Xb+2e'M_{1}Xb+e'M_{1}e\\
&=&b'X'M_{1}Xb+2e'Xb+e'e\\
&=&b'X'M_{1}Xb+e'e\\
&\geqslant&e'e
\end{eqnarray}
Here, the first three lines just multiply out and exploit symmetry and idempotency of $M_1$.

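The fourth line uses $M_1e=e$: regressors and residuals are orthogonal in the sense that $X'e=0$, so if $1\in\langle X\rangle$ (which is the case if there is a constant), we also have $1'e=0$ and hence $M_1e=(I-1(1'1)^{-1}1')e=e$. The fifth line uses $X'e=0$ once more, and the final inequality holds because $b'X'M_{1}Xb=(M_{1}Xb)'(M_{1}Xb)\geqslant0$.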

This also means that $R^2$ need not be nonnegative if $1\not\in\langle X\rangle$: we would then in general have $M_{1}e\neq e$ in the fourth equality and hence $$e'M_{1}X\neq e'X=0,$$ so that $y'M_{1}y\geqslant e'e$ is no longer guaranteed and $R^2<0$ becomes possible.
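A small numerical check may make this concrete. The sketch below fits a regression through the origin on made-up data (the values of `x` and `y` are purely illustrative, chosen so that $y$ has a large mean and a downward trend) and computes both the usual centered $R^2$ and $R^2_u$. It uses plain Python only, so it is not tied to how any particular package reports these quantities.

```python
# Regression through the origin (no constant): y = b*x + e.
# The data are hypothetical and chosen only to illustrate the point.
x = [1.0, 2.0, 3.0, 4.0]
y = [10.0, 9.0, 8.0, 7.0]

def dot(u, v):
    """Inner product u'v of two equal-length lists."""
    return sum(ui * vi for ui, vi in zip(u, v))

b = dot(x, y) / dot(x, x)                      # OLS slope without intercept
y_hat = [b * xi for xi in x]                   # fitted values
e = [yi - fi for yi, fi in zip(y, y_hat)]      # residuals

y_bar = sum(y) / len(y)
tss_centered = sum((yi - y_bar) ** 2 for yi in y)   # y'M_1 y
tss_uncentered = dot(y, y)                          # y'y

r2_centered = 1 - dot(e, e) / tss_centered          # usual definition
r2_uncentered = dot(y_hat, y_hat) / tss_uncentered  # R_u^2

print("sum of residuals:", sum(e))        # nonzero, so M_1 e != e
print("centered R^2:    ", r2_centered)   # negative for these data
print("uncentered R^2:  ", r2_uncentered) # lies in [0, 1]
```

Because the residuals do not sum to zero here, $e'e$ can exceed $y'M_1y$, which is exactly what drives the centered $R^2$ below zero.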

$R_u^2$ is by construction always between 0 and 1, but it is not a flawless alternative: it is not invariant to, for example, adding a constant $a$ to $y$, so you would get a different $R_u^2$ when explaining temperatures measured in Celsius rather than Fahrenheit.
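The lack of invariance can be illustrated with a short sketch. The helper `r_squared_pair` and the temperature readings below are hypothetical, introduced only for this illustration. The model includes a constant, so the centered $R^2$ is unaffected by the change of units, while $R_u^2$ moves because the conversion to Fahrenheit adds 32 to $y$.

```python
def r_squared_pair(x, y):
    """Fit y = a + b*x by OLS and return (centered R^2, uncentered R^2)."""
    n = len(y)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sxy / sxx                  # slope
    a = y_bar - b * x_bar          # intercept
    ssr = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))  # e'e
    centered = 1 - ssr / sum((yi - y_bar) ** 2 for yi in y)
    # With y_hat'e = 0, y_hat'y_hat / y'y equals 1 - e'e / y'y.
    uncentered = 1 - ssr / sum(yi ** 2 for yi in y)
    return centered, uncentered

# Hypothetical readings: hour of day and temperature in Celsius.
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
temp_c = [10.0, 12.5, 11.0, 15.0, 14.5]
temp_f = [1.8 * c + 32.0 for c in temp_c]   # same data in Fahrenheit

r2_c, ru_c = r_squared_pair(hours, temp_c)
r2_f, ru_f = r_squared_pair(hours, temp_f)

print("centered R^2:  ", r2_c, r2_f)   # identical up to rounding
print("uncentered R^2:", ru_c, ru_f)   # differ between the two scales
```

A pure rescaling of $y$ would leave $R_u^2$ unchanged; it is the additive shift that changes $y'y$ relative to $e'e$.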

Christoph Hanck