General

Dear community,

I am really struggling with some important issues for my next project. In general, the investigation is about multi-response forecasting with financial data. The predictability of the responses will be tested in-sample as well as out-of-sample. Furthermore, hypothetical investment strategies will be derived from the forecasts.

Let's say I have a set of time series, each containing the returns of an equity strategy (e.g. Fama & French factor returns). I want to investigate the predictability of "future" observations for each time series with a linear model. Moreover, let's say I have already identified appropriate predictors. I know that all responses are somehow correlated. My hope is to exploit these correlation patterns among the responses in order to obtain more precise out-of-sample forecasts than I would get from "only" using the set of exogenous features.

In a nutshell, I think I have identified the following possibilities to incorporate multi-response correlation into the analysis:

  1. multivariate predictive regressions
  2. using endogenous features (e.g. VARX-model)
  3. Curds & Whey (https://www.stat.berkeley.edu/~breiman/curds-whey-justtext.pdf)

Attempt 1: incorporation of correlations due to multivariate evaluation?

Let's assume that we are only interested in one-month-ahead forecasts. The univariate multiple regression for each response ($q \in \{1,2,\dots,Q\}$) could then be written as follows:

$r_{t+1,q}=a+b_{1}x_{t,1}+\dots+b_{p}x_{t,p}+\text{error}_{t+1,q}$
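To make the timing of this regression concrete, here is a minimal sketch on simulated data. The function name `predictive_ols`, the sample size, and the coefficient values are placeholders I made up for illustration; the only point is that the response at $t+1$ is matched with the predictors observed at $t$.

```python
import numpy as np


def predictive_ols(r, X):
    """OLS of r[t+1] on a constant and X[t] (one-step-ahead predictive regression)."""
    y = r[1:]                                        # responses r_{t+1}
    Z = np.column_stack([np.ones(len(y)), X[:-1]])   # constant and x_{t,1..p}
    coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return coef                                      # [a, b_1, ..., b_p]


# Simulated example (all values are assumptions for illustration only)
rng = np.random.default_rng(0)
T, p = 500, 3
X = rng.normal(size=(T, p))
true_coef = np.array([0.1, 0.5, -0.3, 0.2])          # [a, b_1, b_2, b_3]

r = np.zeros(T)
r[1:] = true_coef[0] + X[:-1] @ true_coef[1:] + 0.05 * rng.normal(size=T - 1)

coef = predictive_ols(r, X)
print(coef)
```

In the univariate approach this function would simply be called once per response $q$.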

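In contrast, instead of estimating the same regression separately for each response $q$, we could estimate all $Q$ equations simultaneously as a multivariate regression in matrix form:

$R_{t+1}=B X_{t}+\text{Error}_{t+1}$

Here $R_{t+1}$ and $\text{Error}_{t+1}$ are $Q \times 1$ vectors, $X_{t}$ stacks a constant and the $p$ predictors, and $B$ is the $Q \times (p+1)$ coefficient matrix.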

So my question for this attempt is: what differences can I expect? I read that the multivariate model accounts for correlation among the responses (see "Why do we need multivariate regression (as opposed to a bunch of univariate regressions)?"), but what exactly is affected by this? I assume that if there is a difference, it will show up in the residuals, right? Could someone help me understand this with a mathematical derivation?
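One way to probe this numerically is the sketch below, again on simulated data with made-up dimensions and an assumed error correlation of 0.8. It fits all responses jointly by least squares and compares the result with equation-by-equation OLS. Because every equation uses the same regressors, the two sets of coefficients coincide (the least-squares solution is computed column by column), so in this setup the only extra object the joint fit provides is the cross-equation residual covariance matrix. Note that the code stores observations in rows, so the coefficient matrix appears transposed relative to the equation above.

```python
import numpy as np

rng = np.random.default_rng(1)
T, p, Q = 400, 3, 2                        # assumed sizes, for illustration only

X = rng.normal(size=(T, p))
Z = np.column_stack([np.ones(T - 1), X[:-1]])        # regressors dated t

# Errors that are correlated across the Q responses (assumed correlation 0.8)
cov = np.array([[1.0, 0.8], [0.8, 1.0]])
E = rng.multivariate_normal(np.zeros(Q), cov, size=T - 1)
B_true = rng.normal(size=(p + 1, Q))
R_next = Z @ B_true + E                              # rows are R_{t+1}

# Multivariate least squares: all Q responses at once
B_multi, *_ = np.linalg.lstsq(Z, R_next, rcond=None)

# Equation-by-equation OLS, one response at a time
B_uni = np.column_stack(
    [np.linalg.lstsq(Z, R_next[:, q], rcond=None)[0] for q in range(Q)]
)

# Cross-equation residual covariance, available from the joint residual matrix
resid = R_next - Z @ B_multi
Sigma_hat = resid.T @ resid / (resid.shape[0] - Z.shape[1])

same_coefficients = np.allclose(B_multi, B_uni)
print(same_coefficients)
print(Sigma_hat)
```

If this is right, the residuals themselves are also identical in both approaches, and the response correlation would only matter for joint inference (tests across equations) or for estimators that deliberately use `Sigma_hat`.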

Attempt 2: incorporation of (serial-cross-)correlations due to a VARX model?

In contrast to the first attempt, the second one simply uses the response data as features as well. The effect of this incorporation is therefore understandable to me. So let's say we put our features and responses into a VARX model (e.g. order = 1). In addition, let us assume that we only have two responses (sorry, I am a MathJax beginner):

$\begin{pmatrix} r_{t+1,1}\\ r_{t+1,2}\end{pmatrix}=\begin{pmatrix} a_{1}\\ a_{2}\end{pmatrix}+\begin{pmatrix} b_{1,1}\\ b_{2,1}\end{pmatrix}r_{t,1}+\begin{pmatrix} b_{1,2}\\ b_{2,2}\end{pmatrix}r_{t,2}+\begin{pmatrix} b_{1,3}\\ b_{2,3}\end{pmatrix}x_{t,1}+\begin{pmatrix} b_{1,4}\\ b_{2,4}\end{pmatrix}x_{t,2}+\dots+\begin{pmatrix} b_{1,p+2}\\ b_{2,p+2}\end{pmatrix}x_{t,p}+\begin{pmatrix} \text{error}_{t+1,1}\\ \text{error}_{t+1,2}\end{pmatrix}$

However, my question here is the following: if I use, for example, the vars package to estimate the VARX model, does this imply that I could also estimate "both" equations separately with OLS? Or is there any "multivariate" effect of the VARX model (besides the incorporation of lagged endogenous features)? Moreover, what changes if additional contemporaneous features enter the model?
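To check the "separately with OLS" part myself, I put together the following sketch of a two-response VARX(1) on simulated data. It uses plain NumPy rather than the vars package, and the coefficient matrices `A` and `C`, the intercept `a`, and the noise level are hypothetical values chosen only so that the simulated system is stable. Each equation contains the same regressors (constant, both lagged responses, all exogenous features), and the joint least-squares fit is compared with two separate OLS fits.

```python
import numpy as np

rng = np.random.default_rng(2)
T, p = 600, 2                              # assumed sample size and number of features

X = rng.normal(size=(T, p))
A = np.array([[0.3, 0.1], [-0.2, 0.4]])    # hypothetical lag-1 coefficients (stable)
C = np.array([[0.5, 0.0], [0.2, -0.3]])    # hypothetical coefficients on x_t
a = np.array([0.01, -0.02])                # hypothetical intercepts

# Simulate r_{t+1} = a + A r_t + C x_t + error_{t+1}
R = np.zeros((T, 2))
for t in range(T - 1):
    R[t + 1] = a + A @ R[t] + C @ X[t] + 0.1 * rng.normal(size=2)

# Regressors dated t: constant, r_{t,1}, r_{t,2}, x_{t,1}, ..., x_{t,p}
Z = np.column_stack([np.ones(T - 1), R[:-1], X[:-1]])
Y = R[1:]

# Joint fit of both equations, then each equation on its own
coef_joint, *_ = np.linalg.lstsq(Z, Y, rcond=None)
coef_eq1, *_ = np.linalg.lstsq(Z, Y[:, 0], rcond=None)
coef_eq2, *_ = np.linalg.lstsq(Z, Y[:, 1], rcond=None)

separate_matches_joint = np.allclose(
    coef_joint, np.column_stack([coef_eq1, coef_eq2])
)
print(separate_matches_joint)
print(coef_joint[1:3].T)                   # estimated lag matrix, compare with A
```

In this sketch the coefficient estimates agree, which holds whenever all equations share the same regressors. It does not cover the case where the equations have different regressors or where restrictions are imposed across equations, so that part of my question remains open.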

I hope I have described my problem appropriately and that somebody is able to help me. I am really looking forward to your answers.

Greets!

Bruno