I'm evaluating the prediction error of three cross-validated models by plotting observations against predictions. To do so, I'm comparing the RMSE (root mean squared error) and Pearson's r between predictions and observations.
(Note: negative binomial models; sample n = 49, mean = 13.33, SD = 17.27.)
The RMSE values are 18.81, 18.97, and 17.48, respectively; the Pearson's r values are 0.10, 0.09, and 0.33.
How can I interpret this large difference (~70%) in correlation alongside only a minor difference (~10%) in RMSE? Am I right to say that the third model is much better at predicting extreme values than the other two? My understanding is that a model that always predicts the mean would have a correlation of 0 (or undefined, since the predictions are constant), yet its RMSE would only be as large as the SD of the observations (17.27 here), which is close to the RMSE values I report. Is there any other explanation?
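To make the intuition concrete, here is a minimal numpy sketch using simulated overdispersed counts (not my actual data, just draws with roughly the same n, mean, and SD). It checks that the constant-mean predictor's RMSE equals the sample SD, and that adding noise to a predictor can collapse the correlation while leaving RMSE in the same ballpark:

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical overdispersed counts loosely mimicking the data described
# (n = 49, mean ~ 13, SD ~ 17); illustrative only.
y = rng.negative_binomial(1, 1 / 14, size=49).astype(float)

def rmse(obs, pred):
    return float(np.sqrt(np.mean((obs - pred) ** 2)))

# Baseline "model": always predict the sample mean.
pred_mean = np.full_like(y, y.mean())

# RMSE of the mean predictor equals the (population) SD of the
# observations, even though its correlation with y is undefined
# (a constant predictor has zero variance).
print(rmse(y, pred_mean), y.std())

# A predictor that tracks y only weakly, plus noise: the correlation
# drops sharply while the RMSE stays near the SD of the observations.
pred_noisy = y.mean() + 0.2 * (y - y.mean()) + rng.normal(0, y.std(), 49)
r = np.corrcoef(y, pred_noisy)[0, 1]
print(round(rmse(y, pred_noisy), 2), round(r, 2))
```

This is the sense in which RMSE is bounded near the SD for any "reasonable" predictor, so large relative changes in correlation need not move RMSE much.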