In this question a commenter says that "differencing a series that is not integrated is certainly problematic from the statistical perspective". What is an integrated time series, and why is differencing a series that is not integrated problematic?

- I guess the main question is, and the title should be, *Why is differencing a series that is not integrated problematic?* Meanwhile, *What is an integrated time series?* is really basic: *integrated* essentially means a series has a unit root; see e.g. [here](https://stats.stackexchange.com/tags/unit-root/info). – Richard Hardy Sep 13 '17 at 19:31
1 Answer
Consider the first difference $\Delta u_t$ of a linear process (a fairly general way of saying that a series does not have a unit root) $u_t=\sum_{j=0}^\infty\psi_j\epsilon_{t-j}$ with $\psi_0=1$ and $\sum_{j=0}^\infty|\psi_j|<\infty$, i.e. $$ \Delta u_t=\sum_{j=0}^\infty\psi_j\epsilon_{t-j}-\sum_{j=0}^\infty\psi_j\epsilon_{t-j-1}. $$ The long-run variance of $\Delta u_t$ is zero. This is why a stationary process should not be differenced "too" often: the estimated long-run variance enters, for example, the denominator of t-ratios, and a population quantity that is zero has no business being in a denominator.
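A quick simulation illustrates the point (a sketch, not part of the original answer: it assumes Gaussian white noise for $u_t$, a plain Bartlett-kernel long-run variance estimator, and an illustrative $n^{1/3}$ bandwidth rule). The long-run variance estimate for the stationary series is close to its true value of $1$, while the estimate for the over-differenced series collapses toward zero.

```python
import numpy as np

rng = np.random.default_rng(0)

def long_run_variance(x, bandwidth):
    """Bartlett-kernel (Newey-West-type) estimate of the long-run variance
    J = gamma_0 + 2 * sum_k gamma_k."""
    x = x - x.mean()
    n = len(x)
    J = np.dot(x, x) / n  # gamma_0
    for k in range(1, bandwidth + 1):
        gamma_k = np.dot(x[k:], x[:-k]) / n
        J += 2 * (1 - k / (bandwidth + 1)) * gamma_k
    return J

n = 100_000
u = rng.standard_normal(n)   # stationary, not integrated: true J = sigma^2 = 1
du = np.diff(u)              # over-differenced series: true J = 0

bw = int(n ** (1 / 3))       # illustrative bandwidth rule of thumb
print(long_run_variance(u, bw))   # close to 1
print(long_run_variance(du, bw))  # close to 0
```

Any t-ratio whose denominator uses the second estimate is dividing by something that converges to zero, which is exactly the problem the answer describes.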
We find the $MA$ coefficient sequence of $\Delta u_t$, call it $d(L)$, and then show that $d(1)^2=0$.
Write $$ \Delta u_t=\epsilon_t+\sum_{j=1}^\infty(\psi_j-\psi_{j-1})\epsilon_{t-j}\equiv\sum_{j=0}^\infty d_j\epsilon_{t-j} $$ with $d_0=\psi_0=1$ and $d_j=\psi_j-\psi_{j-1}$. Hence $\sum_{j=0}^\infty d_j=1+\psi_1-\psi_{0}+\psi_2-\psi_{1}+\psi_3-\psi_{2}+\ldots=0$.
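The telescoping of the partial sums, $\sum_{j=0}^N d_j=\psi_N\to0$, is easy to verify numerically (a sketch; the coefficients $\psi_j=0.5^j$, i.e. an AR(1) written as a linear process, and the truncation point are assumptions chosen for illustration):

```python
import numpy as np

# Assumed illustrative coefficients: psi_j = 0.5**j, the linear-process
# representation of the AR(1) u_t = 0.5*u_{t-1} + eps_t.
N = 200  # truncation point; psi_N is negligibly small
psi = 0.5 ** np.arange(N + 1)

# MA coefficients of the differenced series: d_0 = psi_0, d_j = psi_j - psi_{j-1}
d = np.concatenate(([psi[0]], np.diff(psi)))

# The sum telescopes: sum_{j=0}^N d_j = psi_N, which vanishes as N grows.
print(d.sum())  # ~ 0
```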
The long-run variance can be written as $J=\sigma^2(\sum_{j=0}^\infty d_j)^2$. Hence, $J=0$.
This is because, in general, the long-run variance of an $MA(\infty)$ process $Y_t=\mu+\sum_{j=0}^\infty\psi_j\epsilon_{t-j}$ can be written as $$ J=\sigma^2\biggl(\sum_{j=0}^\infty\psi_j\biggr)^2 $$ Take $\sigma^2=1$ w.l.o.g. Writing out the right-hand side gives \begin{eqnarray*} \biggl(\sum_{j=0}^\infty\psi_j\biggr)^2&=&\psi_0\psi_0+\psi_0\psi_1+\psi_0\psi_2+\psi_0\psi_3+\ldots\\ &&+\quad\psi_1\psi_0+\psi_1\psi_1+\psi_1\psi_2+\psi_1\psi_3+\ldots\\ &&+\quad\psi_2\psi_0+\psi_2\psi_1+\psi_2\psi_2+\psi_2\psi_3+\ldots\\ &&+\quad\psi_3\psi_0+\psi_3\psi_1+\psi_3\psi_2+\psi_3\psi_3+\ldots\\ &=&\ldots\\ &=&\sum_{j=0}^\infty\psi_j^2+2\sum_{j=0}^\infty\psi_j\psi_{j+1}+2\sum_{j=0}^\infty\psi_j\psi_{j+2}+2\sum_{j=0}^\infty\psi_j\psi_{j+3}+\ldots\\ &=&\gamma_0+2\gamma_1+2\gamma_2+2\gamma_3+\ldots\\ &=&J \end{eqnarray*} where the second-to-last line uses expressions for autocovariances of $MA(\infty)$-processes.
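The identity $J=\sigma^2\bigl(\sum_{j=0}^\infty\psi_j\bigr)^2=\gamma_0+2\sum_{k\ge1}\gamma_k$ can also be checked numerically (a sketch; the coefficients $\psi_j=0.7^j$, $\sigma^2=1$, and the truncation are assumptions for illustration):

```python
import numpy as np

# Assumed illustrative coefficients: psi_j = 0.7**j, sigma^2 = 1,
# truncated at N where the tail is negligible.
N = 500
psi = 0.7 ** np.arange(N + 1)

# Autocovariances of the MA(infinity) process: gamma_k = sigma^2 * sum_j psi_j psi_{j+k}
gamma = np.array([psi[: N + 1 - k] @ psi[k:] for k in range(N + 1)])

J_from_coeffs = psi.sum() ** 2            # sigma^2 * (sum_j psi_j)^2
J_from_acovs = gamma[0] + 2 * gamma[1:].sum()

print(J_from_coeffs, J_from_acovs)        # both ~ (1/0.3)^2 = 11.11...
```

Both routes give the same number, and plugging in the differenced coefficients $d_j$ instead of $\psi_j$ gives zero, as shown above.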

- It cannot be right that the stationary variance $J$ of the differenced series is zero. I believe your last equation should instead read $J=\sigma^2 \sum_{j=0}^\infty d_j^2$. Hence, $J$ is far from zero. – Jarle Tufto Sep 15 '17 at 10:28
- I edited my answer for some additional detail. I do not see a mistake. – Christoph Hanck Sep 15 '17 at 10:53
- But clearly, $J=\operatorname{Var}Y_t=\operatorname{Var}(\sum_{j=0}^\infty \psi_j \epsilon_{t-j})=\sum_{j=0}^\infty\operatorname{Var}( \psi_j \epsilon_{t-j})=\sum_{j=0}^\infty \psi_j^2\operatorname{Var}( \epsilon_{t-j})=\sigma^2\sum_{j=0}^\infty \psi_j^2$ – Jarle Tufto Sep 15 '17 at 11:03
- Clearly not, because $\operatorname{Var}(X+Y)=\operatorname{Var}(X)+\operatorname{Var}(Y)+2\operatorname{Cov}(X,Y)$, not $\operatorname{Var}(X)+\operatorname{Var}(Y)$. – Christoph Hanck Sep 15 '17 at 11:05
- Note $J$ is not the variance of $Y_t$ – see e.g. https://stats.stackexchange.com/questions/153444/what-is-the-long-run-variance – Christoph Hanck Sep 15 '17 at 11:09