I have two series $x_t,z_t$, and compute the differences like $\Delta_h x_t=x_t-x_{t-h}$. What is a good estimator of the covariance of changes? $$Cov[\Delta_h x_t,\Delta_h z_t]$$ The intervals are overlapping, so the series $\Delta_h x_t$ are autocorrelated, i.e. $Cov[\Delta_h x_t,\Delta_h x_{t+1}]>0$ for $h>1$. Hence, I'm not sure the usual covariance estimator is the best in this situation.

We can assume that $\Delta_1 x_t$ are stationary and not autocorrelated, if that's necessary. The series themselves are not necessarily stationary; they could be random walks, for instance. Otherwise, I'd prefer not to make strong assumptions about the series $x_t,z_t$.
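
To make the setup concrete: if the one-step changes are uncorrelated over time and, as an extra assumption not stated above, also have no cross-correlation at non-zero lags, then $Cov[\Delta_h x_t,\Delta_h z_t]=h\,Cov[\Delta_1 x_t,\Delta_1 z_t]$. A minimal sketch on simulated data compares the usual estimator on overlapping changes with $h$ times the one-step estimator. The function names and simulation parameters are hypothetical, chosen only for illustration.

```python
import numpy as np

def overlapping_cov(x, z, h):
    """Usual sample covariance computed on overlapping h-step changes."""
    dx = x[h:] - x[:-h]
    dz = z[h:] - z[:-h]
    return np.cov(dx, dz)[0, 1]

def scaled_one_step_cov(x, z, h):
    """h times the sample covariance of one-step changes.

    Only valid under the assumption that one-step changes are
    uncorrelated over time, both within and across the two series.
    """
    return h * np.cov(np.diff(x), np.diff(z))[0, 1]

# Simulate two correlated random walks (illustrative parameters).
rng = np.random.default_rng(0)
n, h, rho = 20000, 5, 0.6
step_cov = np.array([[1.0, rho], [rho, 1.0]])
steps = rng.multivariate_normal([0.0, 0.0], step_cov, size=n)
x, z = steps.cumsum(axis=0).T

est_overlap = overlapping_cov(x, z, h)
est_scaled = scaled_one_step_cov(x, z, h)
print(est_overlap, est_scaled)  # both should be close to h * rho
```

In this simulation both estimates target the same quantity, $h\rho$; the scaled version avoids the overlap entirely but leans on the no-serial-correlation assumption.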

Aksakal

2 Answers

I faced a similar problem earlier and found some related literature, e.g.

  1. Britten‐Jones, Mark, Anthony Neuberger, and Ingmar Nolte. "Improved inference in regression with overlapping observations." Journal of Business Finance & Accounting 38.5‐6 (2011): 657-683.
  2. Harri, Ardian, and B. Wade Brorsen. "The overlapping data problem." Available at SSRN 76460 (1998).
  3. Hansen, Lars Peter, and Robert J. Hodrick. "Forward exchange rates as optimal predictors of future spot rates: An econometric analysis." The Journal of Political Economy (1980): 829-853.

I do not remember finding any really simple solution in these papers (but my memory cannot be trusted).

I was after correlation (rather than covariance) given overlapping observations. I thought that the following could perhaps help: run a simple regression of $\Delta_h x_t$ on $\Delta_h z_t$ with ARMA errors (such as described in Rob J. Hyndman's blog post "The ARIMAX model muddle"). The ARMA errors should take care of the statistical artifacts that result from the observations overlapping. The resulting $R^2$ could perhaps be interpreted as the squared correlation. Going from correlations to covariances should not be too difficult. My thinking is only heuristic, but perhaps the idea could be developed and become useful.
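
Written out, the heuristic might look as follows. The MA($h-1$) error order is my assumption: it is what overlapping $h$-step sums of uncorrelated one-step changes would induce, but any ARMA order could be selected instead.

$$\Delta_h x_t=\alpha+\beta\,\Delta_h z_t+u_t,\qquad u_t=\varepsilon_t+\theta_1\varepsilon_{t-1}+\dots+\theta_{h-1}\varepsilon_{t-h+1}$$

Given an estimate $\hat\beta$, the covariance could then be recovered from the slope, or from the correlation and the two standard deviations:

$$\widehat{Cov}[\Delta_h x_t,\Delta_h z_t]=\hat\beta\,\widehat{Var}[\Delta_h z_t]=\hat\rho\,\hat\sigma_{\Delta_h x}\,\hat\sigma_{\Delta_h z}$$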

Richard Hardy
  • It's an interesting idea. I was looking at something like the HAC Newey-West estimator. – Aksakal Jun 01 '15 at 20:18
  • In case of autocorrelated errors, HAC does only half of the job. It accounts for the failure of model assumptions by increasing the standard errors/confidence intervals. Meanwhile, regression with ARMA errors does the whole job. It adjusts the model coefficients to account for the ARMA structure in the errors and does not yield increased standard errors/confidence intervals. Thus I would prefer regression with ARMA errors over HAC. However, this only holds for the case of autocorrelated errors. Whether and how it carries over to overlapping observations remains to be answered. – Richard Hardy Jun 02 '15 at 05:43
  • @Aksakal: I realize that this question is old, but I was reading it anyway and got a little confused. Doesn't one have the same problem when calculating covariance estimates of two series even when the two series are not differenced? I don't see where the differencing part comes into it, but I could be confused. It seems like it would be a problem in general and not just when differencing? Thanks for clarification. – mlofton May 10 '21 at 14:39
  • @mlofton, it is not very convenient to work with (covariances of) random walks, and differencing comes in handy in that respect. Subject-matter considerations may also make differences more interesting than levels. – Richard Hardy May 10 '21 at 14:49
  • Hi Richard: Right. I see what you're saying. But my point was that I think (but could be wrong) the issue comes up whether one is dealing in differences of levels that are stationary or levels that are stationary. Is that correct? – mlofton May 11 '21 at 18:29
  • @mlofton, not sure I understand the question, but overlapping observations, integrated/stationary series and levels vs. differences are three distinct issues. – Richard Hardy May 11 '21 at 18:53
  • Hi Richard: I'm sorry for not being clear. What I mean is that, suppose you have two stationary series $x$ and $y$ and you want to calculate the covariance between the two series at different lags, say lag 1 and lag 2 for example. Then you also have overlapping in that calculation (say $\rho_{1}$ and $\rho_{2}$), so I'm not seeing where the differencing comes into play. Thanks. – mlofton May 12 '21 at 20:55

If I understand the question well, then in the univariate case the autocorrelation matrix is a Toeplitz matrix.

In the multivariate case the matrix will be a block Toeplitz matrix. In a block Toeplitz matrix all the variables are grouped per time $t$: the blocks of correlations among the variables at $h=0$ lie on the diagonal, and the off-diagonal blocks hold the correlations for $h>0$.

Toeplitz matrices are highly constrained: the number of elements to estimate in a Toeplitz matrix is far smaller than in an unrestricted correlation matrix.
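
As a small illustration of the univariate case, here is a sketch that builds the Toeplitz autocorrelation matrix of overlapping $h$-step changes and counts its free elements. It assumes uncorrelated, equal-variance one-step changes, so that the lag-$k$ autocorrelation is $\max(0, 1-k/h)$; the helper names are hypothetical.

```python
import numpy as np

def overlap_autocorr(h, n):
    """Theoretical autocorrelations of overlapping h-step changes at lags 0..n-1,
    assuming uncorrelated, equal-variance one-step changes."""
    lags = np.arange(n)
    return np.clip(1.0 - lags / h, 0.0, None)

def toeplitz_from_first_row(r):
    """Symmetric Toeplitz matrix whose (i, j) entry is r[|i - j|]."""
    r = np.asarray(r)
    idx = np.abs(np.subtract.outer(np.arange(len(r)), np.arange(len(r))))
    return r[idx]

n, h = 6, 3
R = toeplitz_from_first_row(overlap_autocorr(h, n))

free_toeplitz = n - 1             # one value per non-zero lag
free_general = n * (n - 1) // 2   # one value per distinct off-diagonal pair
print(R)
print(free_toeplitz, free_general)  # 5 15
```

With $n$ time points the Toeplitz form needs only $n-1$ correlations, against $n(n-1)/2$ for an unrestricted correlation matrix.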

spdrnl