
How is the long-run variance defined in time series analysis?

I understand it is used when there is a correlation structure in the data, so our stochastic process would not be a family of i.i.d. random variables $X_1, X_2, \dots$ but rather only identically distributed ones?

Could I have a standard reference as an introduction to the concept and the difficulties involved in its estimation?

Danica
Monolite
  • related: https://stats.stackexchange.com/questions/154070/where-is-the-dominated-convergence-theorem-being-used/273021#273021 – Taylor Feb 22 '18 at 04:58

1 Answer


It is a measure of the variance of the (appropriately scaled) sample mean when there is serial dependence.

Suppose $Y_t$ is covariance stationary with $E(Y_t)=\mu$ and $Cov(Y_t,Y_{t-j})=\gamma_j$ (in an i.i.d. setting, these autocovariances would be zero for $j\neq0$!) such that $\sum_{j=0}^\infty|\gamma_j|<\infty$. Then $$\lim_{T\to\infty}\{Var[\sqrt{T}(\bar{Y}_T- \mu)]\}=\lim_{T\to\infty}\{TE(\bar{Y}_T- \mu)^2\}=\sum_{j=-\infty}^\infty\gamma_j=\gamma_0+2\sum_{j=1}^\infty\gamma_j,$$ where the first equality is definitional, the second is a bit more tricky to establish, and the third is a consequence of stationarity, which implies that $\gamma_j=\gamma_{-j}$.
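To make this concrete, here is a small illustrative sketch (not part of the original answer; it assumes numpy, and the AR(1) specification and parameter values are just illustrative choices). For $Y_t=\phi Y_{t-1}+\varepsilon_t$ with $Var(\varepsilon_t)=\sigma^2$, the autocovariances are $\gamma_j=\sigma^2\phi^j/(1-\phi^2)$, so the long-run variance is $\sigma^2/(1-\phi)^2$, and a Monte Carlo approximation of $Var[\sqrt{T}(\bar{Y}_T-\mu)]$ should come out close to that value:

```python
# Illustrative sketch: long-run variance of an AR(1), Y_t = phi*Y_{t-1} + e_t.
# Theoretical autocovariances: gamma_j = sigma2 * phi**j / (1 - phi**2),
# hence long-run variance = gamma_0 + 2*sum_{j>=1} gamma_j = sigma2 / (1 - phi)**2.
import numpy as np

rng = np.random.default_rng(0)
phi, sigma2, T, reps = 0.5, 1.0, 1_000, 2_000

true_lrv = sigma2 / (1 - phi) ** 2          # = 4 for these illustrative values

means = np.empty(reps)
for r in range(reps):
    e = rng.normal(scale=np.sqrt(sigma2), size=T)
    y = np.empty(T)
    y[0] = e[0] / np.sqrt(1 - phi ** 2)     # draw the start from the stationary law
    for t in range(1, T):
        y[t] = phi * y[t - 1] + e[t]
    means[r] = y.mean()

print("theoretical long-run variance:", true_lrv)
print("Monte Carlo Var[sqrt(T)*Ybar]:", T * means.var())
```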

So the problem is indeed the lack of independence. To see this more clearly, write the variance of the sample mean as \begin{align*} E(\bar{Y}_T- \mu)^2&=E\left[(1/T)\sum_{t=1}^T(Y_t- \mu)\right]^2\\ &=1/T^2E[\{(Y_1- \mu)+(Y_2- \mu)+\ldots+(Y_T- \mu)\}\\ &\quad\{(Y_1- \mu)+(Y_2- \mu)+\ldots+(Y_T- \mu)\}]\\ &=1/T^2\{[\gamma_0+\gamma_1+\ldots+\gamma_{T-1}]+[\gamma_1+\gamma_0+\gamma_1+\ldots+\gamma_{T-2}]\\ &\quad+\ldots+[\gamma_{T-1}+\gamma_{T-2}+\ldots+\gamma_1+\gamma_0]\} \end{align*} Collecting terms, $E(\bar{Y}_T-\mu)^2=\frac{1}{T}\sum_{j=-(T-1)}^{T-1}\left(1-\frac{|j|}{T}\right)\gamma_j$, so all autocovariances enter the variance of the sample mean.
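As a quick numerical check of this expansion (again only a sketch, reusing the illustrative AR(1) autocovariances $\gamma_j=\sigma^2\phi^j/(1-\phi^2)$ from above), one can evaluate the double sum $E(\bar{Y}_T-\mu)^2=\frac{1}{T^2}\sum_{s=1}^T\sum_{t=1}^T\gamma_{|s-t|}$ directly and see that $T$ times it approaches the long-run variance:

```python
# Sketch: evaluate (1/T^2) * sum_{s,t} gamma_{|s-t|} directly for illustrative
# AR(1) autocovariances and compare T * Var(Ybar_T) with its limit.
import numpy as np

phi, sigma2, T = 0.5, 1.0, 200
gamma = sigma2 * phi ** np.arange(T) / (1 - phi ** 2)    # gamma_0, ..., gamma_{T-1}

s, t = np.meshgrid(np.arange(T), np.arange(T))
var_ybar = gamma[np.abs(s - t)].sum() / T ** 2           # exact variance of the sample mean

print("T * Var(Ybar_T)        :", T * var_ybar)          # about 3.97 for T = 200
print("long-run variance limit:", sigma2 / (1 - phi) ** 2)
```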

A problem with estimating the long-run variance is that we of course do not observe all autocovariances with finite data. Kernel estimators (in econometrics often called "Newey-West" or HAC estimators) are used to this end:

$$ \hat{J}_T\equiv\hat{\gamma}_0+2\sum_{j=1}^{T-1}k\left(\frac{j}{\ell_T}\right)\hat{\gamma}_j $$ Here, $k$ is a kernel or weighting function and the $\hat\gamma_j$ are sample autocovariances. Among other things, $k$ must be symmetric and satisfy $k(0)=1$. $\ell_T$ is a bandwidth parameter.

A popular kernel is the Bartlett kernel $$k\left(\frac{j}{\ell_T}\right) = \begin{cases} \bigl(1 - \frac{j}{\ell_T}\bigr) \qquad &\mbox{for} \qquad 0 \leqslant j \leqslant \ell_T-1 \\ 0 &\mbox{for} \qquad j > \ell_T-1 \end{cases} $$ Good textbook references are Hamilton, Time Series Analysis, or Fuller, Introduction to Statistical Time Series. A seminal (but technical) journal article is Newey and West, Econometrica 1987.
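For completeness, here is a minimal sketch of such an estimator with the Bartlett kernel (my own illustration, not from the answer; it assumes numpy, the function name is arbitrary, and the default bandwidth $\ell_T=\lfloor 4(T/100)^{2/9}\rfloor$ is just a commonly used rule of thumb):

```python
# Minimal sketch of the Bartlett-kernel ("Newey-West") long-run variance estimator
#   Jhat_T = gammahat_0 + 2 * sum_{j=1}^{l_T - 1} (1 - j/l_T) * gammahat_j.
import numpy as np

def bartlett_lrv(y, ell=None):
    """Long-run variance estimate with the Bartlett kernel and bandwidth ell (= l_T)."""
    y = np.asarray(y, dtype=float)
    T = y.size
    if ell is None:
        ell = int(4 * (T / 100) ** (2 / 9))              # rule-of-thumb bandwidth (illustrative)
    yc = y - y.mean()
    # sample autocovariances gammahat_j = (1/T) * sum_t (y_t - ybar)(y_{t-j} - ybar)
    gammahat = np.array([yc[j:] @ yc[:T - j] / T for j in range(ell)])
    weights = 1 - np.arange(1, ell) / ell                # Bartlett weights k(j / l_T)
    return gammahat[0] + 2 * (weights @ gammahat[1:])

# Quick check on the illustrative AR(1) from above (true long-run variance = 4):
rng = np.random.default_rng(1)
phi, T = 0.5, 5_000
y = np.zeros(T)
for t in range(1, T):
    y[t] = phi * y[t - 1] + rng.normal()
print("Bartlett/Newey-West estimate:", bartlett_lrv(y))
```

The estimate is only in the ballpark of the truth here; how fast $\ell_T$ should grow with $T$ (it must grow, but not too fast) is precisely one of the difficulties in estimation the question asks about.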

Christoph Hanck
  • Thank you! I checked Time Series Analysis by Hamilton. It does in fact say that a non-parametric way to estimate the spectrum is to take a weighted average of the sample covariances, but it does not delve into the mathematics behind this statement. Could you suggest a reference book or paper that explains why this is a good estimator as the sample size increases? – Monolite May 22 '15 at 11:21
  • Good point. Made some edits. – Christoph Hanck May 22 '15 at 11:25
  • It is perhaps worth mentioning that the second ("tricky") step requires dominated convergence (see https://stats.stackexchange.com/questions/154070/where-is-the-dominated-convergence-theorem-being-used ). – Tamas Ferenci Nov 21 '18 at 08:14
  • @TamasFerenci, thanks for the pointer, I included the link. – Christoph Hanck Nov 21 '18 at 13:56
  • @Christoph Hanck, you're welcome, thanks for the update! – Tamas Ferenci Nov 23 '18 at 22:25