
I'm working on simple linear regression, and I would like to understand the relationship between RMSE and RSS (residual sum of squares).

In another Stack Exchange question, I found some explanations, but they didn't answer my particular question directly, and not in a way I could understand.

What is the relationship between RMSE and RSS in linear regression?

trevor

2 Answers

  • The RSS is the sum of the squared errors (the differences between the estimated and the observed values):

$ RSS = \sum_{i=1}^{n}{(\hat Y_i-Y_i)^2} $

  • The MSE is the mean of the squared errors, that is, the RSS divided by the number of observations $n$:

$ MSE = \frac{1}{n}\sum_{i=1}^{n}{(\hat Y_i-Y_i)^2} $

  • The RMSE is the square root of the MSE:

$ RMSE = \sqrt{MSE} $

Substituting the definitions into one another gives:

$ RMSE = \sqrt{MSE} = \sqrt{\frac{1}{n} \cdot RSS} $

You can check it in the example that you posted:

$ RMSE = \sqrt{\frac{1}{32} \cdot 447.6743} = 3.740297 $

Note that for the mtcars dataset $n=32$.
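If it helps to see the same relationship numerically, here is a minimal Python sketch. The function names and the small toy data set are my own assumptions for illustration (they are not from your example); only the values 447.6743 and 32 come from the calculation above.

```python
import math


def rss(y_true, y_pred):
    """Residual sum of squares: sum of squared differences."""
    return sum((p - t) ** 2 for t, p in zip(y_true, y_pred))


def mse(y_true, y_pred):
    """Mean squared error: RSS divided by the number of observations."""
    return rss(y_true, y_pred) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error: square root of the MSE."""
    return math.sqrt(mse(y_true, y_pred))


def rmse_from_rss(rss_value, n):
    """RMSE computed directly from a known RSS and sample size n."""
    return math.sqrt(rss_value / n)


# Hypothetical toy data, made up purely to exercise the functions.
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.5, 5.5, 7.0, 10.0]

toy_rss = rss(y_true, y_pred)    # 0.25 + 0.25 + 0.0 + 1.0 = 1.5
toy_rmse = rmse(y_true, y_pred)  # sqrt(1.5 / 4)

# Both routes to the RMSE agree.
assert math.isclose(toy_rmse, rmse_from_rss(toy_rss, len(y_true)))

# The check from the answer: RSS = 447.6743 with n = 32.
mtcars_rmse = rmse_from_rss(447.6743, 32)
print(round(mtcars_rmse, 6))  # approximately 3.740297
```

So if your software reports only the RSS, dividing by $n$ and taking the square root is all you need to get the RMSE.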


Also see this question

Luis

With the derivation in hand, you might ask why you would use one measure over the other to assess the performance of a given model. You could use either, but the advantage of RMSE is that it comes out in more interpretable units. For example, if you were building a model that used house features to predict house prices, RSS would come out in dollars squared and, because it sums over every observation, would be a really huge number. RMSE would come out in dollars, and its magnitude would make more sense given the range of your house price predictions.
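To make the house price point concrete, here is a small Python sketch. The residuals are invented numbers (hypothetical prediction errors in dollars), so treat it as an illustration under that assumption, not as results from a real model.

```python
import math

# Hypothetical prediction errors in dollars (predicted minus actual price).
residuals = [10_000.0, -20_000.0, 15_000.0, -5_000.0]


def rss_from_residuals(res):
    """Residual sum of squares; units are dollars squared."""
    return sum(r ** 2 for r in res)


def rmse_from_residuals(res):
    """Root mean squared error; units are dollars."""
    return math.sqrt(rss_from_residuals(res) / len(res))


small_rss = rss_from_residuals(residuals)    # 750,000,000 dollars squared
small_rmse = rmse_from_residuals(residuals)  # roughly 13,693 dollars

# Repeat the same errors to mimic a data set twice as large.
doubled = residuals * 2
big_rss = rss_from_residuals(doubled)    # twice the RSS
big_rmse = rmse_from_residuals(doubled)  # same RMSE as before

print(small_rss, big_rss)
print(round(small_rmse, 2), round(big_rmse, 2))
```

The RSS is in the hundreds of millions and doubles when the data set doubles, while the RMSE stays on the scale of a typical error in dollars, which is why it is easier to compare against the prices themselves.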

jklaus