I am implementing a general-purpose prediction tool for time series. Since I want it to tolerate missing values, I decided to build it around dynamic linear models (DLMs). To make it useful on as many datasets as possible, I want it to try several different models, estimate the parameters of each, and then produce the forecast with the one that fits best. This should allow the tool to pick up as many relevant patterns as possible and make the forecasts as accurate as possible.
Here is my question: in every source I have read so far, model selection is done with the likelihood or related criteria such as the AIC. This does not seem optimal for forecasting: these criteria tell you how well the model fits the data statistically, but not how well it will predict future values. I think it would make much more sense to assess models on their predictive power within the dataset itself. For instance, you could compute the mean squared error between the one-step-ahead forecasts at all intermediate time points and the actual realizations of the series, which is straightforward thanks to the recursive nature of a DLM. With this approach you would not risk over-fitting, since you are assessing exactly the quantity you care about: forecasting accuracy.

Do you see any reason why I would be wrong? Why does everybody use maximum likelihood? Have you seen any references that use something close to what I am suggesting?
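To make the idea concrete, here is a minimal sketch of what I have in mind, for the simplest possible case (a local level DLM with fixed, known variances). The function name `one_step_mse`, the candidate variance pairs, and the toy series are just placeholders, not part of any actual implementation:

```python
import numpy as np

def one_step_mse(y, obs_var, state_var, m0=0.0, C0=1e7):
    """One-step-ahead forecast MSE for a local level DLM (random walk plus noise),
    computed with the standard Kalman filter recursions. NaNs in y are treated as
    missing: the forecast step still runs, but the update step is skipped."""
    m, C = m0, C0                      # filtered mean and variance of the state
    sq_errors = []
    for t, yt in enumerate(y):
        # Forecast step: prior for the state and the observation at time t
        a, R = m, C + state_var        # one-step-ahead state forecast
        f, Q = a, R + obs_var          # one-step-ahead observation forecast
        if np.isnan(yt):
            m, C = a, R                # missing observation: carry the prior forward
            continue
        if t > 0:                      # skip the first point (diffuse prior)
            sq_errors.append((yt - f) ** 2)
        # Update step
        K = R / Q                      # Kalman gain
        m = a + K * (yt - f)
        C = (1 - K) * R
    return np.mean(sq_errors)

# Model selection by predictive accuracy: pick the candidate variance pair
# (hypothetical values here) with the lowest one-step-ahead forecast MSE.
y = np.array([1.0, 1.2, np.nan, 1.5, 1.4, 1.8, 2.1, 2.0])
candidates = [(1.0, 0.1), (0.5, 0.5), (0.1, 1.0)]   # (obs_var, state_var)
best_obs_var, best_state_var = min(candidates, key=lambda p: one_step_mse(y, *p))
```

In the real tool the candidates would be different model structures (trend, seasonality, regression components, etc.), but the selection criterion would be this kind of in-sample one-step-ahead forecast error rather than the likelihood.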