
RMSE (Root mean square error) and SD (Standard deviation) have similar formulas.

This link says

The only difference is that you divide by $n$ and not $n-1$ since you are not subtracting the sample mean here. The RMSE would then correspond to $\sigma$. Therefore, the population RMSE is $\sigma$ and you want a CI for that.

So I want to know whether RMSE and SD are the same. I would also appreciate a reference on this.

JH.Kim

1 Answer


TL;DR: While the formulas are similar, RMSE and standard deviation are used for different purposes.

You are right that standard deviation and RMSE are similar: both are square roots of averaged squared differences between some values. Nonetheless, they are not the same. Standard deviation measures the spread of data around the mean, while RMSE measures the distance between observed values and the predictions for those values. RMSE is generally used to measure prediction error, i.e. how much the predictions you made differ from the data being predicted. If you use the mean as your prediction for all cases, then RMSE and SD are exactly the same (provided the SD is computed with divisor $n$ rather than $n-1$).
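To make the comparison concrete, here is a minimal sketch in Python with NumPy. The data values, the `rmse` helper, and the second set of predictions are made up for illustration; they are not from the question.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean square error between observed values and predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.sqrt(np.mean((y_true - y_pred) ** 2))

# Made-up sample, chosen only so the numbers come out round.
y = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

# Use the sample mean as the prediction for every case.
mean_pred = np.full_like(y, y.mean())
rmse_mean = rmse(y, mean_pred)

# Standard deviation with divisor n (ddof=0) and with divisor n - 1 (ddof=1).
sd_population = np.std(y, ddof=0)
sd_sample = np.std(y, ddof=1)

# A hypothetical alternative prediction that is off by 1 everywhere.
other_pred = y + 1.0
rmse_other = rmse(y, other_pred)

print(rmse_mean, sd_population, sd_sample, rmse_other)
```

With the mean as the prediction, `rmse_mean` matches `sd_population` but not `sd_sample`, which is the $n$ versus $n-1$ difference mentioned in the question. With any other prediction, RMSE measures how far the predictions are from the data and has no fixed relation to the spread of `y`.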

As a side note, the mean is the value that minimizes the sum of squared distances to all the values in the sample. This is why we use standard deviation along with it: they are related species!
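The minimizing property of the mean can be checked with a short derivation. The notation ($x_1, \dots, x_n$ for the sample, $\bar{x}$ for its mean, $c$ for an arbitrary constant prediction) is introduced here for illustration:

```latex
$$
\sum_{i=1}^{n} (x_i - c)^2
  = \sum_{i=1}^{n} (x_i - \bar{x})^2
  + 2(\bar{x} - c)\sum_{i=1}^{n} (x_i - \bar{x})
  + n(\bar{x} - c)^2
  = \sum_{i=1}^{n} (x_i - \bar{x})^2 + n(\bar{x} - c)^2 ,
$$
```

because $\sum_{i=1}^{n} (x_i - \bar{x}) = 0$. The last term is non-negative and vanishes only at $c = \bar{x}$, so the mean is the constant prediction with the smallest RMSE, and that smallest RMSE is the SD computed with divisor $n$.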

Tim
  • @Chill2Macht it is not about spread vs distance, but about spread of single variable vs distance between predicted and true values. – Tim Jun 27 '17 at 13:40
  • That does make a lot of sense actually. If we considered the distance between the predicted and true values to be a variable itself, would it be appropriate in your opinion to say that the RMSE is also a measure of the spread of that variable? "Spread of the distance"? – Chill2Macht Jun 27 '17 at 13:59
  • @Chill2Macht but then you *will* be calculating standard deviation. – Tim Jun 27 '17 at 14:12
  • @Chill2Macht RMSE is not sd of errors. Sd(errors) = mean((errors - mean(errors))^2) while rmse = mean(errors^2) – Tim Jun 27 '17 at 16:47
  • That hits the nail on the head -- thank you for your time. – Chill2Macht Jun 27 '17 at 17:59
  • Worth noting that as the mean error approaches 0 and n approaches infinity sd and rmse converge. – Morgan Ball Aug 15 '17 at 09:18
  • @Tim I think you're missing the square root. It should be Sd(errors) = square root( mean((errors - mean(errors))^2)) – Tom Rijntjes Nov 25 '19 at 12:50
  • Hey Tim! Just wondering whether it would be simpler to say that what the standard deviation is to the mean, is the root mean square error to the regression line? – Stefan Aug 15 '20 at 17:37