2

I have observed rainfall data and would like to know how well the predicted data represent them. My first idea was to show the data on a scatter plot: plotting x = observed against y = predicted, with both axes on the same scale and the plot region square.

t   x (observed)   y (predicted)
1   2.3            3.0
2   1.0            1.5
3   4.4            4.8
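A minimal sketch of the scatter plot described above, using matplotlib (the variable names and axis limits are my own choices, not from the question):

```python
# Scatter of observed vs predicted rainfall with a 1:1 reference line.
# Data are the three example points from the question.
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs headless
import matplotlib.pyplot as plt

observed = [2.3, 1.0, 4.4]
predicted = [3.0, 1.5, 4.8]

fig, ax = plt.subplots(figsize=(5, 5))       # square figure
ax.scatter(observed, predicted)
lims = [0, max(observed + predicted) * 1.1]  # common range for both axes
ax.plot(lims, lims, linestyle="--")          # 1:1 line: points on it agree exactly
ax.set_xlim(lims)
ax.set_ylim(lims)
ax.set_aspect("equal")                       # equal axis scaling, as in the question
ax.set_xlabel("observed rainfall")
ax.set_ylabel("predicted rainfall")
fig.savefig("obs_vs_pred.png")
```

The dashed 1:1 line makes systematic over- or under-prediction visible at a glance: points above the line are over-predicted, points below are under-predicted.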

Then I thought of reporting the Pearson correlation coefficient; however, I am not sure what the best ways are to show how well the predictions match the observations.

What other ways exist to assess such data? What would be a sound way to quantify the fit and present it to others? Any creative ideas?

Mr.Man
    In practice Pearson correlation might be useful here but in principle it doesn't measure agreement. Consider as a knockdown case, predicted $=$ positive constant $\times$ observed. Here Pearson correlation is identically 1, always, but the larger the constant $>$1, the poorer the fit. Otherwise and more concisely put, Pearson correlation measures linearity, not agreement. Concordance correlation is a measure of agreement. – Nick Cox Nov 13 '17 at 18:34
  • Some edits of language, but it's not totally clear what your situation is. Observed versus measured: what's the distinction there? Or is it observed versus predicted, and how were predictions obtained? – Nick Cox Nov 13 '17 at 18:50
  • Thanks a lot to all who took the time to give those good answers. It has already made certain things clearer. I do want to show how good the agreement between those two data sets is. – Mr.Man Nov 14 '17 at 07:37
  • See also: https://stats.stackexchange.com/questions/142585/which-rmse-normalisation-approach-to-use – Sal Mangiafico Sep 03 '19 at 12:51
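A small pure-Python sketch of the point raised in the comments: when the predictions are an exact positive multiple of the observations, Pearson correlation is 1 even though agreement is poor, while Lin's concordance correlation coefficient (CCC) is penalized by the scale and location shift. The helper functions here are my own illustration, not from any answer:

```python
# Pearson correlation measures linearity; Lin's concordance correlation
# coefficient (CCC) measures agreement with the 1:1 line.
from statistics import mean, pvariance

def pearson(x, y):
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)
    return cov / (pvariance(x) ** 0.5 * pvariance(y) ** 0.5)

def concordance(x, y):
    # Lin's CCC: 2*cov / (var_x + var_y + (mean_x - mean_y)^2)
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)
    return 2 * cov / (pvariance(x) + pvariance(y) + (mx - my) ** 2)

observed = [2.3, 1.0, 4.4]
scaled = [3 * v for v in observed]  # the knockdown case: predicted = 3 * observed

print(pearson(observed, scaled))      # 1.0: perfectly linear
print(concordance(observed, scaled))  # well below 1: poor agreement
```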

1 Answer

3

Plotting the predicted and actual values is always a good idea to see if there is some systematic error in the predictions.

Often people will use a measure of accuracy or error (list copied from here. Caveat: I am the author of that page):

• Minimum maximum accuracy

• Mean absolute percent error (MAPE)

• Root mean square error (RMSE)

• Normalized root mean square error (NRMSE)

Accuracy measures tend to be reported as a percent or proportion, and so are unitless, with 1 being a perfect fit. In my experience they tend to be high (closer to 1) compared with r-squared values, so you have to be cautious in interpretation.

MAPE is reported as a percent error, and so is unitless, with 0 being a perfect fit. RMSE retains the units of the measured variable; again, 0 would be a perfect fit.
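As a sketch, the error measures above computed on the question's three data points. The formulas follow common definitions; note that NRMSE is normalized here by the range of the observed values, which is only one of several conventions (see the link in the comments on normalization choices):

```python
# MAPE, RMSE, and NRMSE on the observed/predicted rainfall from the question.
observed = [2.3, 1.0, 4.4]
predicted = [3.0, 1.5, 4.8]
n = len(observed)

# Mean absolute percent error: average of |error/observed|, as a percent
mape = sum(abs((o - p) / o) for o, p in zip(observed, predicted)) / n * 100

# Root mean square error: same units as the rainfall data
rmse = (sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n) ** 0.5

# Normalized RMSE: here divided by the range of the observations (unitless)
nrmse = rmse / (max(observed) - min(observed))

print(f"MAPE  = {mape:.1f}%")
print(f"RMSE  = {rmse:.2f}")
print(f"NRMSE = {nrmse:.2f}")
```

For all three measures, 0 indicates a perfect fit; RMSE is the only one that keeps the original units.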

Sal Mangiafico