
Pardon me, I am new to time series forecasting. Given that there is not always a clear-cut way to know whether your forecasting model is good enough, and that there is a significant degree of subjectivity in measuring this or even in defining what "good enough" means, I thought it would be interesting and instructive to find out what people do in practice.

What are the modelling/quantitative criteria that you use to determine that you have a good enough time series forecasting model in practice?

I define a model that's good enough as one that produces reasonable enough forecasts of a time series in practice. Perhaps the question should be: what are the modelling/quantitative criteria that you use to determine that you have a model whose forecasts you believe to be reasonable? Are there certain things you would not accept in your forecasting model (e.g. correlated residuals)? What are they, and why?

(You may assume that you have a good idea of what the regressors are and that you have their future values.)

Newwone
  • CV and intuition is all you got – Aksakal Jun 12 '20 at 19:58
  • @Aksakal, sorry, what's CV? – Newwone Jun 12 '20 at 20:01
  • cross validation. it's not a panacea but that's the only way to assess how the model predicts out of sample. when applied correctly, CV is the only tool you have – Aksakal Jun 12 '20 at 20:04
  • Strongly related: [How to know that your machine learning problem is hopeless?](https://stats.stackexchange.com/q/222179/1352) – Stephan Kolassa Jun 12 '20 at 20:10
  • Does this answer your question? [How to know that your machine learning problem is hopeless?](https://stats.stackexchange.com/questions/222179/how-to-know-that-your-machine-learning-problem-is-hopeless) – mdewey Jun 13 '20 at 13:02
  • @StephanKolassa, thank you. I am not 100% sure that the linked post is much related to this post. The linked post can be interpreted to talk about when to abandon a project/model. My post talks about the opposite - when to accept a model. – Newwone Jun 13 '20 at 16:06
  • @mdewey, thank you. No, it doesn't. See my previous comment - to Stephan. The post you linked is talking about the opposite of what I am talking about. You may think of my question as: "How to know that your timeseries forecasting model is reasonable" – Newwone Jun 13 '20 at 16:07
  • All, it looks like my original post has been misinterpreted possibly because I didn't word it properly - English is not my first language. Please pardon me. I have now hopefully worded it better. – Newwone Jun 13 '20 at 16:09
  • If you are looking for a model "that produces reasonable enough forecasts", that looks to me exactly like figuring out when you can't improve the model any more. "Reasonable enough" depends on what your time series is and on what you know about your time series. As such, it seems to be exactly the question that I linked to. – Stephan Kolassa Jun 13 '20 at 16:30
  • @StephanKolassa, ok, let me try to present my question this way - if you had a forecasting model that you know somehow is a good model (in your opinion), what are the quantitative proofs/evidence/tests etc that you would present to me to back your claim? – Newwone Jun 13 '20 at 16:48
  • Unfortunately, I believe there are none. There is no proof that you *can't* improve your forecast. (There *is* a proof you *can* improve it: simply forecast better.) This is one of the big challenges in my job: customers are unsatisfied with the forecast quality and are convinced that it *must* be possible to improve it. "Use cannibalization!" "Have you tried ML?" "Have you applied Prophet?" And if all these don't help, then there is little you can do. Of course, saying so is hard to distinguish from incompetence on my part. – Stephan Kolassa Jun 13 '20 at 17:55
  • @StephanKolassa, I see your point. At the same time, I am thinking there must be proofs/evidence you can produce - for example, showing that the model in-sample outputs are not too far off from the observed values (which is mentioned in the answer below). I agree this is making an assumption that the model will predict well (out of sample). Aksakal mentioned cross validation. What about residuals analysis (looking at correlation etc), etc? What else could the forecaster use to back his/her view that his model is ok? – Newwone Jun 13 '20 at 18:22
  • In-sample fit is very misleading, because it will always induce you to overfit. Do not use in-sample fit. Much better to look at a [holdout sample](https://otexts.com/fpp2/accuracy.html), which is sometimes called "time series cross validation", and @Aksakal may be referring to this. Yes, serial correlation in (holdout) errors is usually a sign the model can be improved. But at some point, you have a holdout forecast quality and no way to argue this is the best you can do, except for what I described in that answer linked to. – Stephan Kolassa Jun 14 '20 at 07:25
  • In-sample fit is important to me: if a model can't fit well in-sample, it won't work out of sample either. However, in-sample fit alone is not a good measure of model quality. For instance, a regression model will always catch the mean in-sample, and the mean is your main worry because the biggest problems in forecasting come from mean shifts. Hence, a good in-sample fit can give you a false sense of security with a model. Time series cross-validation is really your main guide, but as I noted before it is important that you use it properly, otherwise it will be no different than in-sample fit (a sketch of rolling-origin evaluation follows these comments) – Aksakal Jun 14 '20 at 14:36
  • @Aksakal, would you mind expanding on how to use CV properly in order to determine model quality? Or shall I post it as a new question? – Newwone Jun 14 '20 at 16:30
  • @Newwone this video is not bad https://youtu.be/uoTBdCODGvk?t=622 – Aksakal Jun 14 '20 at 17:02
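
For concreteness, here is a minimal sketch of the rolling-origin ("time series") cross-validation discussed in the comments above. It assumes the `forecast` package and uses `auto.arima` as a stand-in for whatever model you are actually evaluating; the initial window size and horizon are illustrative, not recommendations.

```r
# Rolling-origin evaluation: refit on an expanding window, forecast h steps
# ahead, and collect the out-of-sample errors.
library(forecast)

rolling_origin_mae <- function(y, h = 1, initial = 24) {
  n <- length(y)
  errors <- c()
  for (origin in seq(initial, n - h)) {
    train <- window(y, end = time(y)[origin])   # data up to the forecast origin only
    fit   <- auto.arima(train)                  # swap in your own model-fitting step
    fc    <- forecast(fit, h = h)$mean[h]       # h-step-ahead point forecast
    errors <- c(errors, y[origin + h] - fc)
  }
  mean(abs(errors))                             # out-of-sample MAE
}

# Example with a built-in series:
# rolling_origin_mae(AirPassengers, h = 1, initial = 100)
```

Because each refit only ever sees data up to the forecast origin, the resulting errors are genuinely out of sample, which is the property the comments above insist on.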

2 Answers


If you are using R, you can use the predict function (https://stat.ethz.ch/R-manual/R-devel/library/stats/html/predict.lm.html) to compare your model's predicted values to the actual values.

Of course, if your model is designed as a forecasting tool, you may not be able to assess the future "goodness of fit" currently, but you should still be able to apply the predict function to data where the values of the response variable are known.
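
For instance, here is a minimal sketch on a built-in series, using a plain lm with a time trend and a lagged value of the series as regressors (purely for illustration; the variable names and model are made up, not taken from the question):

```r
# Fit on an initial training window, then compare predictions with the actual
# values that were held out.
df <- data.frame(y = as.numeric(AirPassengers),
                 t = seq_along(AirPassengers))
df$lag_y <- c(NA, head(df$y, -1))   # previous month's value as a regressor
df <- na.omit(df)

train <- df[1:120, ]                # training window
test  <- df[121:nrow(df), ]         # holdout period

fit  <- lm(y ~ t + lag_y, data = train)
pred <- predict(fit, newdata = test)                 # predictions for the holdout period

cbind(actual = test$y, predicted = round(pred, 1))   # side-by-side comparison
```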

Other common measures of fit include RMSE, R-squared, and MAE, all of which can be computed with the postResample function in caret. Link here: https://www.rdocumentation.org/packages/caret/versions/2.27/topics/postResample.
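
Continuing the holdout example above, postResample only needs the vectors of predicted and observed values:

```r
library(caret)

# Returns RMSE, R-squared, and MAE in one call; the values depend on your data.
postResample(pred = pred, obs = test$y)
```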

As you mentioned, autocorrelation is another problem to consider when evaluating time series models. You can use the acf function to quantify and visualize autocorrelation (https://www.rdocumentation.org/packages/stats/versions/3.6.2/topics/acf). Corrections for autocorrelation include robust standard errors and the inclusion of lag terms.
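
Continuing the same example, you might check the residuals like this (the Ljung–Box test is an extra check beyond the acf call mentioned above):

```r
res <- residuals(fit)

# Significant spikes in the ACF suggest structure the model has not captured,
# e.g. missing lag terms or seasonality.
acf(res, main = "ACF of model residuals")

# A portmanteau test gives a single p-value for joint autocorrelation up to lag 12.
Box.test(res, lag = 12, type = "Ljung-Box")
```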


I do this for a living, so it's important to me. :) However, I am a data analyst rather than a statistician, so my answer might differ from a statistician's. The way I assess whether my model is good enough is twofold. First, I track the percent difference for each month and for the year to date (the year is what really matters to us). My rule of thumb is that five percent error is acceptable given our uncertain process, but each person has to make that decision for themselves. I don't think there is a widely accepted, objective way to decide that, and it also depends on how certain and unchanging your process is. Second, I have tried to find (it is not easy) what the rate of error is for others in my area. That gives me a benchmark to compare my results to.
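
If it helps, here is a minimal sketch of that bookkeeping in R, with made-up monthly numbers; the five percent rule of thumb above would be applied to the year-to-date column:

```r
# Percent error by month and cumulative (year-to-date).
actual <- c(100, 110, 105, 120, 115, 130)   # observed monthly values
fcst   <- c( 98, 112, 100, 125, 118, 126)   # forecast monthly values

monthly_pct_err <- (fcst - actual) / actual * 100
ytd_pct_err     <- (cumsum(fcst) - cumsum(actual)) / cumsum(actual) * 100

round(data.frame(month = 1:6, monthly_pct_err, ytd_pct_err), 1)
```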

user54285