Diebold-Mariano with multiple predictions over time

Question

I've been trying to use a DM test on my predictive models. I have two different models. Each model is calibrated using an MSE loss function to estimate the prices of 374 different assets each day. I have calibrated my models for 300 days and now I want to test the accuracy of the two models. I have tried using the pricing error of each asset, giving me 374*300=112 200 pricing errors for each model, as input to the DM test, where the first 374 errors are from the first day, the following 374 errors are from the second day, and so on. This approach gives me implausible results, since the sample size becomes very large. I have also tried setting my forecast horizon to my number of assets, 374, with some improvement, but I still think the results are faulty.

Questions:

Can I apply my DM-test to my setting or do I have to rethink my approach of testing pricing accuracy?

Maybe I'm confused on how to apply a correct forecast horizon to my setting? Any help would be highly appreciated.

score 0 · Answer 1 · answered Aug 11 '17 at 18:36

The Diebold-Mariano test considers two random variables $e_1$ and $e_2$ that generate pairs of forecast errors $(e_{1,t},e_{2,t})$. Given a sample of $T$ realizations of these pairs (i.e. $t=1,\dots,T$), the test assesses whether the expectations of some function (e.g. absolute value or square) of $e_1$ and $e_2$ are equal, i.e. whether $\mathbb{E}|e_1|=\mathbb{E}|e_2|$ or $\mathbb{E}(e_1^2)=\mathbb{E}(e_2^2)$.

In your case, the important thing is which loss function is relevant for you. Would you treat each asset individually or somehow lump them together? I think the latter may be more convenient, if you are fine with ignoring potential differences between assets.

Then you can consider a loss function $L$ such as the sum of absolute values of individual errors across the assets, or the sum of squares. Here $e_{1,t}$ and $e_{2,t}$ would be vector-valued; each element of the vector would come from a different asset, so the vector length would be 374. The loss function would transform the vector into a scalar. Thus you would have 300 observations of pairs of losses $(L(e_{1,1}),L(e_{2,1}));(L(e_{1,2}),L(e_{2,2}));\dots;(L(e_{1,300}),L(e_{2,300}))$. From this point on, applying the Diebold-Mariano test would be straightforward.
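As a rough illustration of this aggregation route, here is a minimal Python sketch. It assumes a sum-of-squares loss and one-step-ahead forecasts ($h=1$); the function names `daily_loss` and `dm_test` and the toy numbers are my own and hypothetical, not taken from any library or from your data. The variance estimate uses the plain truncated sum of $h-1$ autocovariances, which can turn out negative in small samples when $h>1$, so treat it as a sketch rather than a definitive implementation.

```python
import math


def daily_loss(errors):
    """Collapse one day's vector of pricing errors into a scalar (sum of squares)."""
    return sum(e * e for e in errors)


def dm_test(loss1, loss2, h=1):
    """Diebold-Mariano statistic and two-sided normal p-value.

    loss1, loss2: sequences of T scalar losses, one per day.
    h: forecast horizon; h - 1 autocovariances enter the variance estimate.
    """
    d = [a - b for a, b in zip(loss1, loss2)]  # loss differential per day
    T = len(d)
    d_bar = sum(d) / T

    def autocov(k):
        return sum((d[t] - d_bar) * (d[t - k] - d_bar) for t in range(k, T)) / T

    # Long-run variance of d: lag-0 variance plus twice the first h - 1 autocovariances
    lrv = autocov(0) + 2 * sum(autocov(k) for k in range(1, h))
    stat = d_bar / math.sqrt(lrv / T)
    # Two-sided p-value under the asymptotic N(0, 1) null: 2 * (1 - Phi(|stat|))
    p_value = math.erfc(abs(stat) / math.sqrt(2))
    return stat, p_value


# Toy illustration with 4 days and 2 assets (hypothetical numbers, not real data).
errors_1 = [[0.1, -0.2], [0.3, 0.1], [-0.2, 0.2], [0.4, -0.1]]
errors_2 = [[0.2, -0.1], [0.1, 0.1], [-0.3, 0.1], [0.2, -0.2]]
losses_1 = [daily_loss(day) for day in errors_1]
losses_2 = [daily_loss(day) for day in errors_2]
stat, p_value = dm_test(losses_1, losses_2, h=1)
```

In your setting each inner list would hold 374 errors and there would be 300 days, so the test sees $T=300$ loss differentials rather than 112 200 individual errors.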

Alternatively, you could consider each asset individually and run 374 separate Diebold-Mariano tests. But then you would have a multiple testing problem which should be taken care of.
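For the per-asset route, one possible way to handle the multiplicity is Holm's step-down procedure, which controls the family-wise error rate. That choice is my assumption for the sake of illustration; other corrections may suit you better. The helper name `holm_reject` is hypothetical, and the input would be the 374 p-values from the separate tests.

```python
def holm_reject(p_values, alpha=0.05):
    """Return a list of booleans: True where Holm's procedure rejects the null."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, i in enumerate(order):
        # Compare the (rank+1)-th smallest p-value with alpha / (m - rank)
        if p_values[i] <= alpha / (m - rank):
            reject[i] = True
        else:
            break  # step-down: stop at the first non-rejection
    return reject
```

The returned list is in the same order as the input, so entry `i` tells you whether the two models differ significantly for asset `i` after the correction.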
