How to compare the performance of ARIMA and LSTM for time series forecasting?

Question

I am facing some challenges in comparing LSTM and ARIMA on some datasets.

I would like to know if there are any general expectations about how ARIMA and LSTM differ in dealing with outliers. In general, does LSTM deal better with outliers than ARIMA? Or is it not possible to say that for the general case?

Besides that, I read some papers that state that differences in MAE and RMSE, in general, are due to outliers in forecasting. Is it possible to measure in some way the degree of such impact of outliers, by considering the MAE and the RMSE?

What can I conclude if model A is better than model B regarding MAE, but model B is better than model A regarding RMSE?

score 2 · Answer 1 · answered Sep 23 '21 at 15:42

What can I conclude if model A is better than model B regarding MAE, but model B is better than model A regarding RMSE?

It does not make sense to compare the same predictions from different models using different accuracy metrics, because different metrics elicit different functionals from the (frequently only implicit) predictive distribution:

If you want to minimize the RMSE, you extract the predictive expectation.
If you want to minimize the MAE, you extract the predictive median.

Now, if you assume a symmetric future density, like a normal, then of course the two functionals will coincide. But then, the two error measures should carry the same information. So if they tell you different stories, that is a strong indication that your density is not symmetric. So you are back to first deciding which functional you want and only then deciding on the appropriate error measure.
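To make the point about functionals concrete, here is a minimal sketch. It assumes a right-skewed lognormal sample as a stand-in for the future density; the sample, the seed and the helper names `rmse` and `mae` are illustrative choices of mine, not taken from the question. It scores two constant point forecasts, the sample mean and the sample median, under both error measures.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumption: a right-skewed lognormal sample stands in for the
# (usually implicit) predictive distribution of future values.
y = rng.lognormal(mean=0.0, sigma=1.0, size=20_000)

def rmse(point, actuals):
    """Root mean squared error of a constant point forecast."""
    return np.sqrt(np.mean((actuals - point) ** 2))

def mae(point, actuals):
    """Mean absolute error of a constant point forecast."""
    return np.mean(np.abs(actuals - point))

mean_fc = y.mean()        # the expectation functional
median_fc = np.median(y)  # the median functional

for name, fc in [("mean", mean_fc), ("median", median_fc)]:
    print(f"{name:>6} forecast={fc:.3f}  RMSE={rmse(fc, y):.3f}  MAE={mae(fc, y):.3f}")
```

On a skewed sample like this, the mean forecast has the lower RMSE and the median forecast has the lower MAE, so each "model" wins on the measure that matches its functional. That is the same pattern as A beating B on MAE while B beats A on RMSE.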

More information can be found in Kolassa (2020, International Journal of Forecasting). You may also be interested in What are the shortcomings of the Mean Absolute Percentage Error (MAPE)?

score 1 · Answer 2 · answered Sep 23 '21 at 15:24

I'm not aware of any general framework for comparing sensitivity to outliers, as this usually depends on under- or overfitting. I would say that ARIMA will typically not perform well at forecasting values outside the previously observed range. See a blog post here for some comments on that topic.

The other part of your question relates to metrics that will capture sensitivity to outliers. Some comparable metrics which differ by sensitivity to outliers:

Pearson vs Spearman correlation: Spearman only cares about getting the ordering of predictions correct (i.e. the biggest target value results in the biggest prediction) and doesn't consider the actual values. As such, Pearson will be more affected by incorrectly predicted outliers.
Mean vs Median Absolute Errors: The mean absolute error will be affected by an outlier, since a large error contributes more to the mean than the other points do. However, a single outlier won't significantly affect the median error.
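As a rough illustration of both pairs of metrics, here is a small sketch on made-up numbers. The `actual` and `pred` arrays are hypothetical: the ordering of the predictions is correct everywhere, but the final outlier is badly under-predicted. Spearman is computed as Pearson on ranks, which assumes there are no ties.

```python
import numpy as np

def pearson(a, b):
    """Pearson correlation coefficient."""
    return np.corrcoef(a, b)[0, 1]

def ranks(a):
    """Ranks 0..n-1 (assumes no tied values)."""
    return np.argsort(np.argsort(a))

def spearman(a, b):
    """Spearman correlation as Pearson on ranks (no ties assumed)."""
    return pearson(ranks(a), ranks(b))

# Hypothetical data: the last actual value is an outlier that the
# prediction ranks correctly but misses by a wide margin.
actual = np.array([10.0, 12.0, 11.0, 13.0, 14.0, 15.0, 16.0, 100.0])
pred = np.array([9.5, 11.5, 11.2, 12.8, 13.5, 15.5, 16.5, 20.0])

abs_err = np.abs(actual - pred)

print("Pearson :", pearson(actual, pred))
print("Spearman:", spearman(actual, pred))
print("Mean absolute error  :", abs_err.mean())
print("Median absolute error:", np.median(abs_err))
# Same metrics with the outlier dropped, for comparison
print("Mean AE without outlier  :", abs_err[:-1].mean())
print("Median AE without outlier:", np.median(abs_err[:-1]))
```

Here Spearman stays at its maximum because the ordering is perfect, while Pearson drops noticeably. Likewise, the mean absolute error is dominated by the single large miss, whereas the median absolute error is the same with or without it.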

AI also doesn’t work well extrapolating, that’s why it needs so much data. — Aksakal, Sep 23 '21 at 15:50
