I was reading about regression metrics in the Python scikit-learn manual, and even though each one has its own formula, I cannot tell intuitively what the difference is between $R^2$ and the explained variance score, and therefore when to use one or the other to evaluate my models.
2 Answers
$R^2 = 1- \frac{SSE}{TSS}$
$\text{explained variance score} = 1 - \mathrm{Var}[\hat{y} - y]\, /\, \mathrm{Var}[y]$, where the $\mathrm{Var}$ is biased variance, i.e. $\mathrm{Var}[\hat{y} - y] = \frac{1}{n}\sum(error - mean(error))^2$. Compared with $R^2$, the only difference is from the mean(error). if mean(error)=0, then $R^2$ = explained variance score
Also note that adjusted-$R^2$ uses an unbiased variance estimate.
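A quick sketch of this difference with scikit-learn's `r2_score` and `explained_variance_score` (the toy data is made up): predictions with a constant offset have residuals with nonzero mean, so the two metrics diverge.

```python
import numpy as np
from sklearn.metrics import r2_score, explained_variance_score

y_true = np.array([3.0, 5.0, 7.0, 9.0])
# Predictions shifted by a constant, so error = y_pred - y_true has mean 1, not 0
y_pred = y_true + 1.0

print(r2_score(y_true, y_pred))                  # 0.8  -- penalized by the systematic bias
print(explained_variance_score(y_true, y_pred))  # 1.0  -- bias is subtracted out by Var[.]
```

Here $\mathrm{Var}[\hat{y}-y]=0$ because every residual equals its mean, so the explained variance score is a perfect 1.0, while $R^2$ still charges for the offset ($1 - 4/20 = 0.8$).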

SultanOrazbayev

Dean
- sklearn doesn't have adjusted-R2, does it? – Hack-R Jun 08 '17 at 15:05
- @Hack-R actually [it does](https://scikit-learn.org/stable/modules/model_evaluation.html#r2-score-the-coefficient-of-determination) – mMontu Dec 24 '18 at 17:33
- @mMontu That is R2. – vasili111 Feb 25 '21 at 17:26
- @Hack-R For adjusted R2 see this: https://stackoverflow.com/questions/51023806/how-to-get-adjusted-r-square-for-linear-regression and https://stackoverflow.com/questions/49381661/how-do-i-calculate-the-adjusted-r-squared-score-using-scikit-learn/49381947 – vasili111 Feb 25 '21 at 17:27
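Since the comments note that scikit-learn has no built-in adjusted-$R^2$, here is a minimal helper sketching the usual formula $1 - (1 - R^2)\,\frac{n-1}{n-p-1}$ (the function name `adjusted_r2` is my own, not part of sklearn):

```python
from sklearn.metrics import r2_score

def adjusted_r2(y_true, y_pred, n_features):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1),
    where n = number of samples and p = number of predictors."""
    n = len(y_true)
    r2 = r2_score(y_true, y_pred)
    return 1 - (1 - r2) * (n - 1) / (n - n_features - 1)
```

For example, with $R^2 = 0.8$, $n = 4$, and one predictor, this gives $1 - 0.2 \cdot 3/2 = 0.7$.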
Dean's answer is right.
Only I think there is a minor typo there: $\mathrm{Var}[\hat{y}-y]=\frac{1}{n}\sum(error^2-mean(error))$.
I guess it should be $\mathrm{Var}[\hat{y}-y]=\frac{1}{n}\sum(error-mean(error))^2$.
My reference is the source code of sklearn here: https://github.com/scikit-learn/scikit-learn/blob/bf24c7e3d/sklearn/metrics/_regression.py#L396
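To check this against the formula that source implements, a small sketch comparing sklearn's metric with the biased-variance expression computed directly via NumPy (`np.var` uses `ddof=0`, i.e. the biased variance, by default; the random data is purely illustrative):

```python
import numpy as np
from sklearn.metrics import explained_variance_score

rng = np.random.default_rng(0)
y = rng.normal(size=100)
y_hat = y + rng.normal(scale=0.5, size=100)

error = y - y_hat
# Biased variance (ddof=0), matching the sklearn implementation
manual = 1 - np.var(error) / np.var(y)
print(np.isclose(manual, explained_variance_score(y, y_hat)))  # True
```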

Siong Thye Goh

Seraph