
The Wikipedia page on $R^2$ says $R^2$ can take on a value greater than 1. I don't see how this is possible.

> Values of $R^2$ outside the range 0 to 1 can occur where it is used to measure the agreement between observed and modeled values and where the "modeled" values are not obtained by linear regression and depending on which formulation of $R^2$ is used. If the first formula above is used, values can be less than zero. If the second expression is used, values can be greater than one.

That quote refers to the "second expression" but I don't see a second expression on the page.

Is there any scenario where $R^2$ can be greater than 1? I am thinking about this question for nonlinear regression, but would like to get a general answer.

[For someone looking at this page with the opposite question in mind: Yes; $R^2$ can be negative. This happens when you fit a model that fits the data worse than a horizontal line. This would usually be due to a mistake in selecting a model or constraints.]

Firebug
Harvey Motulsky
  • This issue has already been treated at least once on this site (https://stats.stackexchange.com/questions/251337), and I imagine there are more questions that relate to it or explain it completely. The decomposition $$SS_t \text{ (total)} = SS_m \text{ (model)} + SS_e \text{ (error)},$$ and hence $SS_t > SS_e$, holds in general only if the model includes an intercept and the residuals have mean 0. *If $R^2$ relates, most simply, to correlation, and there are no corrections, then it must indeed be no greater than 1. It is just that it is not always calculated in the same way as a correlation.* – Sextus Empiricus Mar 16 '18 at 23:28
  • So you have the two expressions: $$R^2 = 1 - SS_e/SS_t = SS_m/SS_t$$ and it is possible that $SS_m > SS_t$. – Sextus Empiricus Mar 16 '18 at 23:37
  • I calculate R-squared as "1.0 - (absolute_error_variance / dependent_data_variance)" and since the absolute error variance cannot be less than zero, in my calculations the maximum value of R-squared is 1.0 – James Phillips Mar 17 '18 at 00:59
  • It's quirks like these that hold me to thinking that $R^2$ is best taken, in general, to be the square of the correlation between observed and predicted. – Nick Cox Mar 19 '18 at 18:54
  • If $R^2$ could be more than one, that would mean 1 + 1 is more than 2. – Ibrahim Jan 17 '19 at 23:26

2 Answers


I found the answer, so I will post it to my own question. As Martijn pointed out in the comments, with linear regression you can compute $R^2$ with two equivalent expressions:

$$R^2 = 1 - SS_e/SS_t = SS_m/SS_t$$

With nonlinear regression, the sum of squares of the residuals and the sum of squares of the regression do not add up to the total sum of squares, so that decomposition simply does not hold and the combined equation above is not right. The two expressions compute two different values for $R^2$.
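Here is a minimal sketch of that point (made-up data; `scipy.optimize.curve_fit` and an exponential model are just one convenient choice of nonlinear fit):

```python
import numpy as np
from scipy.optimize import curve_fit

# Made-up data that grows roughly exponentially in x (illustrative only)
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([1.0, 1.6, 2.7, 4.8, 7.1, 12.0])

def model(x, a, b):
    return a * np.exp(b * x)

params, _ = curve_fit(model, x, y, p0=[1.0, 0.5])
y_hat = model(x, *params)

ss_e = np.sum((y - y_hat) ** 2)         # residual sum of squares
ss_m = np.sum((y_hat - y.mean()) ** 2)  # regression ("model") sum of squares
ss_t = np.sum((y - y.mean()) ** 2)      # total sum of squares

print(ss_m + ss_e, ss_t)             # need not be equal for a nonlinear fit
print(1 - ss_e / ss_t, ss_m / ss_t)  # so the two R^2 expressions disagree
```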

The only equation that makes sense and is (I think) universally used is:

$$R^2 = 1 - SS_e/SS_t$$

Its value is never greater than 1.0, but it can be negative when you fit the wrong model (or use wrong constraints), so that $SS_e$ (the sum of squares of the residuals) is greater than $SS_t$ (the sum of squares of the differences between the actual Y values and their mean).
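For instance (a minimal sketch with hypothetical numbers), forcing a badly wrong line onto data with a clear positive trend makes $SS_e$ exceed $SS_t$ and drives this $R^2$ below zero:

```python
import numpy as np

# Hypothetical data with a clear positive trend
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([1.1, 2.0, 2.9, 4.2, 5.1])

# A deliberately wrong "fit": a steep line with negative slope
y_hat = 10.0 - 2.0 * x

ss_e = np.sum((y - y_hat) ** 2)     # sum of squared residuals
ss_t = np.sum((y - y.mean()) ** 2)  # total sum of squares about the mean

print(1 - ss_e / ss_t)  # about -8.15: far worse than the horizontal line
```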

The other equation is not used with nonlinear regression:

$$R^2 = SS_m/SS_t$$

But if this equation were used, it would yield $R^2$ greater than 1.0 whenever the model fits the data really poorly, so that $SS_m$ is larger than $SS_t$. This happens when the fit of the model is worse than the fit of a horizontal line, the same situation that leads to $R^2 < 0$ with the other equation.
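Continuing the hypothetical example above, the second expression applied to the same bad fit comes out above 1, because the predicted values spread around the mean more than the data do:

```python
import numpy as np

# Same hypothetical data and deliberately bad fit as in the sketch above
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([1.1, 2.0, 2.9, 4.2, 5.1])
y_hat = 10.0 - 2.0 * x

ss_m = np.sum((y_hat - y.mean()) ** 2)  # regression ("model") sum of squares
ss_t = np.sum((y - y.mean()) ** 2)      # total sum of squares

print(ss_m / ss_t)  # about 4.25: greater than 1 for this poor fit
```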

Bottom line: $R^2$ can be greater than 1.0 only when an invalid (or nonstandard) equation is used to compute $R^2$ and when the chosen model (with constraints, if any) fits the data really poorly, worse than the fit of a horizontal line.

Harvey Motulsky
  • Is that last point correct? Consider data in a perfect line. Now consider a model which exactly fits this line. This has SS_m/SS_t = 1. Now consider the same model but with a slightly steeper gradient. Now SS_m is slightly larger and SS_m/SS_t > 1. The model is a little worse but it still fits the data well, not "really poorly". – Denziloe Mar 12 '20 at 22:07
  • @Denziloe. Your data are perfect or nearly perfect with a positive slope. Now fit a linear regression line with the constraint that the slope be less than -100. The fitted model will fit worse than a horizontal line, so $SS_e$ is greater than $SS_t$. With the first equation, $R^2$ will be negative. With the second equation, $R^2$ will be greater than 1. No, that is not a realistic or common situation. – Harvey Motulsky Mar 13 '20 at 15:33
  • @Denziloe . The model will fit the data really poorly (worse than the null hypothesis of a horizontal line), only if you constrain the slope or intercept to a value that makes no sense. In your example, the model fits the data fine, better than a horizontal line fits. – Harvey Motulsky Mar 13 '20 at 15:34
  • Sorry, I don't really follow that as a response. In my example, SS_m/SS_t > +1 -- do you agree? And the model is a good fit -- again you agree? This would seem to contradict your statement, "R2 can be greater than 1 only when ... the chosen model fits the data really poorly". – Denziloe Mar 13 '20 at 17:11
  • @Denziloe Please send some actual data and fits, so I/we can see what you mean. – Harvey Motulsky Mar 13 '20 at 19:15
  • I feel like I have fully described the idea. But okay, here's a very simple specific instantiation of it. Data: {(-1, -1), (0, 0), (1, 1)}. Perfect model: y = x. SS_m/SS_t > 1 model: y = 1.01*x. – Denziloe Mar 13 '20 at 19:20
  • @Denziloe. SS_m is the sum of squares of the difference between the actual and predicted Y values, which is 2*.1^2 = 0.02. SS_t is the sum of squares of the difference between the actual Y values and the mean of all Y values (the null hypothesis is a horizontal line at that mean), so 2.0. The ratio SS_m/SS_t is 0.02/2 = 0.01, a lot less than 1.0. – Harvey Motulsky Mar 14 '20 at 23:47
  • "SS_m is the sum of squares of the difference between the actual and predicted Y values" -- so for a perfect model, where the predicted values equal the actual values, SS_m is 0 and R2 is 0? Surely that is wrong? Refer to your answer... sounds to me like you are confused with SS_e. – Denziloe Mar 16 '20 at 00:40

By definition, $R^2 = 1 - SS_e/SS_t$, where both SS terms are sums of squares and thus nonnegative. The maximum is attained at $SS_e = 0$, giving $R^2 = 1$.

AlexR
  • This is not true in general, and it only holds when the model variance is less than the error variance. As an example, take a linear regression without an intercept coefficient. – Alex R. Mar 19 '18 at 19:05
  • @AlexR. See Harvey's answer (much better than mine, btw): this only applies if you use another definition of $R^2$. – AlexR Mar 19 '18 at 19:24