I have seen authors saying $R^2 = .13$ is moderate, but they don't mention any source.
-
Without any context, this value is meaningless: see http://stats.stackexchange.com/questions/13314. – whuber May 30 '15 at 18:27
-
$R^2$ is not a measure of the effect of anything, so talking about effect size seems not entirely correct. Explanation: in a linear regression model $y = ax + b + \epsilon$, the effect of $x$ is measured by $a$, but $R^2$ depends not only on $a$ but also on the distribution of the $x$'s, so it is not a measure of effect. – kjetil b halvorsen May 30 '15 at 18:27
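To illustrate this comment, here is a minimal simulation sketch (assuming Python with NumPy; the numbers are invented): the slope $a$ and the noise level are held fixed while the spread of the $x$'s varies, and $R^2$ changes even though the effect $a$ does not.

```python
import numpy as np

rng = np.random.default_rng(0)

def r_squared(x, y):
    # Fit y = a*x + b by least squares and return R^2.
    a, b = np.polyfit(x, y, 1)
    resid = y - (a * x + b)
    return 1 - resid.var() / y.var()

a, sigma = 1.0, 1.0           # fixed slope and noise level
for x_sd in (0.5, 1.0, 3.0):  # only the spread of x changes
    x = rng.normal(0, x_sd, 10_000)
    y = a * x + rng.normal(0, sigma, 10_000)
    print(f"sd(x) = {x_sd}: R^2 ≈ {r_squared(x, y):.2f}")
# The slope a is identical in all three runs, but R^2 rises with var(x):
# roughly 0.20, 0.50, 0.90, since R^2 = a^2 var(x) / (a^2 var(x) + sigma^2).
```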
-
@kjetil b halvorsen, $R^2$ is often considered a reasonable effect size measure. Furthermore, it's not just used in the case of regression. Perhaps consider coefficient of determination as effect size. – John May 30 '15 at 18:40
-
What makes you think a source is required to state your interpretation of the magnitude of your findings? If you've read the field, studied your data, and understand your numbers, you know what small, medium and large are. The only kind of source that would be good to cite would be one that specifically addressed magnitudes within the given literature. – John May 30 '15 at 18:42
-
@John Can you give examples/citations where $R^2$ reasonably can be interpreted as an effect size? It would at least require the pairs $(x,y)$ to be randomly sampled! In designed experiments you can choose the design to inflate $R^2$. – kjetil b halvorsen May 30 '15 at 18:45
-
@John In every case I have seen where $R^2$ could be used as an effect size, it is really a surrogate for something that would be more meaningful. For instance, in a community that runs similar experiments with similar (or identical) statistical properties of the regressor variables and similar response variables, then $R^2$ is meaningful as a measure of the dispersion of the residuals. As soon as anyone deviates from such a community norm experiment, though, it becomes problematic (or even nonsensical) to compare two such $R^2$ values. – whuber May 30 '15 at 19:06
-
@whuber, I never said there are any cases where it's the best one. I'd never use it or endorse it. – John May 30 '15 at 19:13
-
@kjetil b halvorsen, you just gave your own good arguments and examples for and against. The question author does not once mention regression. This could very well be correlation. – John May 30 '15 at 19:17
-
Seber "Linear Regression Analysis" quotes Tukey(1954) "correlation coefficients are justified in two and only two circumstances, where they are regression coefficients, or when the measurement of one or both variables on a determinate scale is hopeless." – kjetil b halvorsen May 30 '15 at 20:09
-
A large $R^2$ is whatever is publishable as such in your field. – Nick Cox Oct 16 '15 at 18:17
3 Answers
The reference you are looking for comes from the behavioral sciences. Cohen (1988) proposed 'small', 'medium', and 'large' magnitudes for $R^2$, standardized mean differences (Cohen's d), and bivariate correlations (Cohen's r), among other measures. The proposed values do not come from thin air; Cohen justifies them, but he also explains that they are only very general guidelines, not set in stone, and that subject-matter considerations also weigh in when deciding what a relevant effect size is. Specifically for $R^2$, as per pp. 413-414 of the book, the proposed 'small', 'medium' and 'large' values are 0.02, 0.13, and 0.26, respectively.
Reference: Cohen, J. (1988). Statistical Power Analysis for the Behavioral Sciences (2nd ed.). Hillsdale, NJ: Lawrence Erlbaum Associates.
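As a side note not in the original answer: Cohen's $R^2$ benchmarks line up with his $f^2$ benchmarks of 0.02, 0.15, and 0.35 through the relation $f^2 = R^2/(1 - R^2)$. A quick check (Python sketch):

```python
# Sketch: relate Cohen's R^2 benchmarks to his f^2 benchmarks via f^2 = R^2 / (1 - R^2).
for label, r2 in [("small", 0.02), ("medium", 0.13), ("large", 0.26)]:
    f2 = r2 / (1 - r2)
    print(f"{label}: R^2 = {r2:.2f}  ->  f^2 ≈ {f2:.2f}")
# Output: 0.02 -> ~0.02, 0.13 -> ~0.15, 0.26 -> ~0.35,
# matching Cohen's conventional f^2 values of 0.02, 0.15, and 0.35.
```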

It depends on the context.
Sometimes a process is fairly deterministic and the signal is strong relative to the noise. Then even a relatively simple benchmark model will have an $R^2$ as high as, say, 0.80. An example could be modelling people's weight given their height, age and gender.
Other times there is not that much signal and there is a lot of randomness. Then even a sophisticated model may not achieve an $R^2$ above, say, 0.20. An example could be modelling the daily movements of stock prices given the prices' own histories (and maybe some other variables).
So it really depends.
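To make the contrast concrete, here is a minimal simulation sketch (assuming Python with NumPy; the signal-to-noise ratios are made up, not taken from the examples above):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5_000

def r_squared(x, y):
    # R^2 of a simple least-squares fit of y on x.
    resid = y - np.polyval(np.polyfit(x, y, 1), x)
    return 1 - resid.var() / y.var()

x = rng.normal(size=n)

# Strong-signal setting: the outcome is mostly determined by the predictor.
y_strong = 2.0 * x + rng.normal(scale=1.0, size=n)   # signal sd 2, noise sd 1
print("strong signal R^2 ≈", round(r_squared(x, y_strong), 2))  # ~0.80

# Noisy setting: the predictor carries little information.
y_noisy = 0.2 * x + rng.normal(scale=1.0, size=n)    # signal sd 0.2, noise sd 1
print("weak signal  R^2 ≈", round(r_squared(x, y_noisy), 2))    # ~0.04
```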

I agree with Richard Hardy but would like to add that it also depends on the penalty or cost of an error in the context of your model. For example, a model with lower $R^2$ may have less costly errors for your purposes than a model with higher $R^2$ but more costly errors. In other words, $R^2$ needs to be considered in the context of what you are trying to explain with your model.
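As a hedged illustration of this point (Python/NumPy sketch; the two "models" and the 10:1 asymmetric cost are invented for the example, not taken from the answer):

```python
import numpy as np

rng = np.random.default_rng(2)
y = rng.normal(size=10_000)          # the quantity we want to predict

# Hypothetical predictions from two models (illustrative numbers only):
pred_a = y + rng.normal(scale=0.5, size=y.size)          # accurate on average
pred_b = y - 0.4 + rng.normal(scale=0.4, size=y.size)    # deliberately predicts low

def r_squared(y, pred):
    return 1 - np.mean((y - pred) ** 2) / np.var(y)

def expected_cost(y, pred):
    # Assumed asymmetric loss: over-predicting costs 10x as much as under-predicting.
    err = pred - y
    return np.mean(np.where(err > 0, 10 * err, -err))

for name, pred in [("A", pred_a), ("B", pred_b)]:
    print(f"model {name}: R^2 ≈ {r_squared(y, pred):.2f}, "
          f"cost ≈ {expected_cost(y, pred):.2f}")
# Model A has the higher R^2 (~0.75 vs ~0.68) but roughly three times the
# expected cost (~2.2 vs ~0.8) under this asymmetric loss, so the "better"
# model depends on what the errors actually cost you.
```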
-
Answers can move around in a thread, so references to answers "above" or "below" are too fragile. I have taken the liberty of editing that out. – Nick Cox Oct 16 '15 at 18:15