How to interpret the direct comparison of Continuous Ranked Probability Score (CRPS) and Mean Absolute Error (MAE)?

Question

Say I have a trained Random Forest (RF) consisting of $m$ decision trees, and I am interested in estimating $y$ from $t_1$ to $t_n$. The good thing about RF is that I have an ensemble of estimators and a deterministic estimator in one place.

After using the RF for estimation, I calculate the MAE for the RF, and I calculate the CRPS for its ensemble members: the $m$ decision tree regressors.

CRPS is a probabilistic measure that seeks to evaluate the accuracy of the ensemble, and MAE targets the same quality but for the deterministic model.

Is there any point in comparing these two metrics? What could be the intuition behind saying, for example, that my CRPS is slightly smaller than the MAE in this case?

I don't have any statistical muscle, but my gut feeling is that this is a vague comparison: accuracy of an ensemble vs. accuracy of a single model. They are not meant to compete; they don't belong to the same world. The ensemble aims to resemble the distribution of $y$, while the deterministic model is after the average of $y$.

I hope I could express my confusion in a not confusing way.

score 0 · Answer 1 · answered Feb 11 '21 at 07:47

You are mistaken. The CRPS ("Continuous Ranked Probability Score") is an example of a proper scoring rule. It evaluates a full predictive density.
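
For reference, here is the standard definition, written with notation that is introduced here and does not appear in the question: $F$ is the predictive CDF, $y$ the observation, and $X, X'$ are independent draws from $F$ (assumed to have a finite first moment).

$$\mathrm{CRPS}(F, y) = \int_{-\infty}^{\infty} \left(F(x) - \mathbf{1}\{x \ge y\}\right)^2 \, dx = E|X - y| - \tfrac{1}{2} E|X - X'|$$

If $F$ is a point mass at a single forecast $\hat{y}$, the second term vanishes and the CRPS reduces to the absolute error $|\hat{y} - y|$. In that sense the CRPS generalizes the absolute error from point forecasts to density forecasts.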

But your $m$ trees do not output a full density! Instead, each separate tree gives you its best forecast for a single point prediction (which will depend on the loss function it minimizes, so it will probably be an estimate of the future expected value, cf. Kolassa, 2020, IJF). So the variation you see in your $m$ tree forecasts is not an estimate of the future variability in the observables, but of the variability in the (unobservable!) mean!

Thus, it doesn't make sense to apply the CRPS to the ensemble point predictions of your $m$ trees in the first place, because these do not form a density forecast for the observables!
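
A small simulation may make this concrete. It is only a sketch under stated assumptions, not a model of an actual random forest: the values `MU`, `SIGMA`, `TREE_SD`, `M`, and `N` are hypothetical, the "tree" ensemble is simulated as noisy estimates of the mean, and `crps_ensemble` is an illustrative helper that applies the empirical form of the identity CRPS $= E|X - y| - \tfrac{1}{2} E|X - X'|$.

```python
import random
import statistics


def crps_ensemble(members, y):
    """Empirical CRPS of an ensemble for a single observation y.

    Uses CRPS = E|X - y| - 0.5 * E|X - X'|, with both expectations
    taken over the empirical distribution of the ensemble members.
    """
    m = len(members)
    term_obs = sum(abs(x - y) for x in members) / m
    term_spread = sum(abs(a - b) for a in members for b in members) / (m * m)
    return term_obs - 0.5 * term_spread


# With a single member the CRPS is just the absolute error.
print(crps_ensemble([2.0], 3.5))  # 1.5

rng = random.Random(0)
MU, SIGMA = 10.0, 2.0  # hypothetical true mean and noise sd of y
M, N = 30, 500         # hypothetical ensemble size and number of cases
TREE_SD = 0.2          # assumed spread of the members' estimates of the mean

abs_errors, crps_trees, crps_density = [], [], []
for _ in range(N):
    y = rng.gauss(MU, SIGMA)
    # "Tree-like" ensemble: every member estimates E(y), so the spread is narrow.
    trees = [rng.gauss(MU, TREE_SD) for _ in range(M)]
    # Density-like ensemble: members are draws from the distribution of y itself.
    draws = [rng.gauss(MU, SIGMA) for _ in range(M)]
    abs_errors.append(abs(statistics.fmean(trees) - y))
    crps_trees.append(crps_ensemble(trees, y))
    crps_density.append(crps_ensemble(draws, y))

mae = statistics.fmean(abs_errors)
mean_crps_trees = statistics.fmean(crps_trees)
mean_crps_density = statistics.fmean(crps_density)
print(f"MAE of ensemble mean:        {mae:.3f}")
print(f"CRPS, ensemble of means:     {mean_crps_trees:.3f}")
print(f"CRPS, draws from density:    {mean_crps_density:.3f}")
```

Under these assumptions, the CRPS of the narrow ensemble of mean estimates should come out only slightly below the MAE, because its spread term is tiny. An ensemble that actually samples the distribution of $y$ should score noticeably better. That pattern would be consistent with seeing a CRPS "slightly smaller than MAE", without the ensemble being a useful density forecast.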


Stephan Kolassa


Thank you Stephan for the answer and the reference. I don't quite grasp it though. At each time step $t$ we have $m$ different estimates. Why can't they be seen as a PDF/CDF for $y$? I thought that was the whole point of ensemble modelling: to have a distribution instead of a single deterministic estimate. Second, the CRPS is calculated at each time step by comparing this distribution with the one constructed from the single observation, and is then averaged over time. Do you think it is flawed this way? – Alireza Amani Feb 11 '21 at 16:34
The key question is: a distribution *of what*? Your ensemble gives (an estimate of) a distribution for the unobservable mean $\mu=E(y)$. The CRPS evaluates a density prediction of $y$. The difference is precisely the same as between a confidence interval (which your ensemble gives, for the mean), and a prediction interval (which can be derived from a true density forecast, i.e., one for the observations). [See here.](https://stats.stackexchange.com/tags/prediction-interval/info) – Stephan Kolassa Feb 11 '21 at 18:38
Let's say $y$ is the air temperature and we use humidity ($x_1$) and wind speed ($x_2$) as two features to predict $y$. We are not trying to forecast a time series, so $t$ is merely an index here. To answer your question: at index $t_1$, $x_1$ and $x_2$ are within some range. The $m$ estimates for this situation (time index) are trying to resemble the distribution of $y$ when $x_1$ and $x_2$ have some specific values or are within some specific range. – Alireza Amani Feb 12 '21 at 02:52
This discussion has nothing at all to do with whether we are forecasting a time series. Everything applies equally whether we have a time series or some other prediction problem. Since you are writing about *the distribution of $y$*, you are looking for prediction intervals, or a density prediction. As above, collecting the point predictions from the $m$ trees won't help you, since each one will have abstracted away the residual noise. I recommend [How do I calculate prediction intervals for random forest predictions?](https://stats.stackexchange.com/q/49750/1352) Does that help? – Stephan Kolassa Feb 12 '21 at 07:15
It most definitely helps! Thanks a lot. In my field of study and its peer-reviewed papers, I see the CRPS used in multi-model forecasting a lot. The difference is that they use dynamical models and I want to use a Random Forest. As I described, I was thinking of each model's point forecast as an empirical way to construct a PDF for $y$ at each time step. This discussion helps me to be skeptical and to question that. – Alireza Amani Feb 12 '21 at 16:10
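
The distinction drawn in the comments between a confidence interval (for the mean) and a prediction interval (for a new observation) can be sketched numerically. This is only an illustration under simple assumptions: i.i.d. Gaussian data, a normal approximation with `z = 1.96`, and hypothetical values for `MU` and `SIGMA`. The helper `interval_half_widths` is introduced here for the example.

```python
import math
import random
import statistics

rng = random.Random(1)
MU, SIGMA = 10.0, 2.0  # hypothetical true mean and noise sd


def interval_half_widths(n):
    """Return approximate 95% (CI, PI) half-widths from a sample of size n."""
    sample = [rng.gauss(MU, SIGMA) for _ in range(n)]
    s = statistics.stdev(sample)
    z = 1.96  # normal approximation
    ci = z * s / math.sqrt(n)          # uncertainty about the mean
    pi = z * s * math.sqrt(1 + 1 / n)  # uncertainty about a new observation
    return ci, pi


for n in (20, 200, 2000):
    ci, pi = interval_half_widths(n)
    print(f"n={n:5d}  CI half-width={ci:.3f}  PI half-width={pi:.3f}")
```

As the sample grows, the confidence interval for the mean shrinks toward zero, while the prediction interval stays close to `1.96 * SIGMA`, because the residual noise in $y$ does not go away. The spread of point predictions across an ensemble behaves like the former, not the latter.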
