
For regression problems, I have seen people use the "coefficient of determination" (a.k.a. R squared) to perform model selection, e.g., finding the appropriate penalty coefficient for regularization.

However, it is also common to use "mean squared error" (MSE) or "root mean squared error" (RMSE) as a measure of regression accuracy.

So what is the main difference between these two? Can they be used interchangeably for regularization and regression tasks? And what is the main usage of each in practice, such as in machine learning and data mining tasks?

asked by dolaameng, edited by chl

1 Answer


$R^2=1-\frac{SSE}{SST}$, where $SSE$ is the sum of squared errors (residuals, i.e., deviations from the regression line) and $SST$ is the sum of squared deviations of the dependent variable $Y$ from its mean.

$MSE=\frac{SSE}{n-m}$, where $n$ is the sample size and $m$ is the number of parameters in the model (including intercept, if any).

$R^2$ is a standardized measure of degree of predictedness, or fit, in the sample. $MSE$ is the estimate of variance of residuals, or non-fit, in the population. The two measures are clearly related, as seen in the most usual formula for adjusted $R^2$ (the estimate of $R^2$ for population):

$R_{adj}^2=1-(1-R^2)\frac{n-1}{n-m}=1-\frac{SSE/(n-m)}{SST/(n-1)}=1-\frac{MSE}{\sigma_y^2}$.
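As a minimal NumPy sketch (on synthetic data, with illustrative variable names), the formulas above can be computed directly, and the identity $R_{adj}^2 = 1 - MSE/(SST/(n-1))$ checked numerically:

```python
import numpy as np

# Synthetic data for illustration: y roughly linear in x plus noise
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=50)
y = 2.0 + 3.0 * x + rng.normal(0, 1.5, size=50)

n, m = len(y), 2                   # sample size; parameters (slope + intercept)
b, a = np.polyfit(x, y, 1)         # ordinary least-squares fit
y_hat = a + b * x

sse = np.sum((y - y_hat) ** 2)     # sum of squared residuals
sst = np.sum((y - y.mean()) ** 2)  # total sum of squares about the mean of y

r2 = 1 - sse / sst
mse = sse / (n - m)                # residual-variance estimate (df-corrected)
r2_adj = 1 - (1 - r2) * (n - 1) / (n - m)

# Adjusted R^2 equals 1 - MSE divided by the sample variance of y:
assert np.isclose(r2_adj, 1 - mse / (sst / (n - 1)))
```

Note the denominator $n-m$ rather than $n$: it makes $MSE$ an unbiased estimate of the residual variance, which is what makes the identity with adjusted $R^2$ exact.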

answered by ttnphns
    I thought MSE is the avg of the errors, which means MSE = SSE/n, on what occasions do we use MSE=SSE/(n-m)? Please explain. Thanks – Sincole Brans Jul 25 '14 at 06:52
  • @SincoleBrans Please see http://en.wikipedia.org/wiki/Mean_squared_error, section "Regression". – ttnphns Jul 25 '14 at 12:12
  • I'm a bit confused. The results in https://martin-thoma.com/regression/ show that a model can be good (compared to some other models) with R^2, but at the same time bad with MSE. Could you explain that? – Martin Thoma Apr 17 '18 at 07:03