I am currently working on my final college project. I forecasted national soybean yield and used MAPE to measure the in-sample and out-of-sample forecasting accuracy. The MAPE results showed that the in-sample accuracy is higher than the out-of-sample accuracy. Are in-sample forecasts always supposed to be more accurate than out-of-sample forecasts, and why is that? I cannot seem to find the explanation in my textbook, so if you happen to know about in-sample and out-of-sample forecasting accuracy, please help me >< Thank you so much!^^
In-sample accuracy is almost always higher than out-of-sample accuracy, simply because in-sample is the data you are fitting your model to. So this is no reason for concern. (To the contrary, if your out-of-sample accuracy were higher than in-sample, I would wonder whether something is broken.)
This is part of the reason why in-sample accuracy is very rarely reported on. It's not a good guide to out-of-sample accuracy, and optimizing in-sample accuracy can lead to overfitting.
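To make this concrete, here is a minimal sketch in Python with NumPy. Everything in it is an assumption for illustration: the series is synthetic (a linear trend plus noise, not real soybean data), the model is a least-squares polynomial trend, and the helper names `mape` and `in_and_out_of_sample_mape` are made up. The model is fitted only to the first 30 points, and MAPE is then computed both on those points and on the 10 held-out points.

```python
import numpy as np

def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    return float(np.mean(np.abs((actual - forecast) / actual)) * 100)

# Synthetic "yield" series: linear trend plus noise (made-up numbers).
rng = np.random.default_rng(0)
t = np.arange(40)
y = 50 + 0.5 * t + rng.normal(scale=2.0, size=t.size)

# Hold out the last 10 observations; the model never sees them while fitting.
t_train, y_train = t[:30], y[:30]
t_test, y_test = t[30:], y[30:]

def in_and_out_of_sample_mape(degree):
    # Least-squares polynomial trend, fitted to the training data only.
    model = np.polynomial.Polynomial.fit(t_train, y_train, deg=degree)
    return mape(y_train, model(t_train)), mape(y_test, model(t_test))

for degree in (1, 5):
    in_mape, out_mape = in_and_out_of_sample_mape(degree)
    print(f"degree {degree}: in-sample MAPE {in_mape:.2f}%, "
          f"out-of-sample MAPE {out_mape:.2f}%")
```

The more flexible degree-5 trend can follow the noise in the training data more closely, so I would expect its in-sample MAPE to look better while its out-of-sample MAPE gets worse. The exact numbers depend on the seed and the simulated data.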
You may find the 2nd and 3rd editions of Forecasting: Principles and Practice by Athanasopoulos & Hyndman interesting reading.

Stephan Kolassa
-
Thank you so much for answering my question^^ But I am still having trouble understanding the part of your explanation that says 'simply because in-sample is the data you are fitting your model to'. If you don't mind, could you explain more about this? And if the out-of-sample accuracy were higher, does that mean the model is overfitting? Thank you! – adin Jun 29 '21 at 11:28
-
When you train (or fit, whatever term you prefer) your model, you are fitting it *to the training data*, and you are explicitly aiming for a good fit. Conversely, you don't fit it to the (unseen) testing data. So your fitting will naturally lead to a higher accuracy in-sample than out-of-sample. To your second point, overfitting results in *much* lower accuracy out-of-sample than in-sample. Higher accuracy out-of-sample than in-sample will usually only occur if you have data leakage. – Stephan Kolassa Jun 29 '21 at 11:53
-
Thank you so much for your explanation, it really helps a lot! I hope you have a good day and stay safe!^^ – adin Jun 29 '21 at 13:11