7

According to this question and answer, *Explosive AR(MA) processes are stationary?*, the AR(1) process (with $e_t$ white noise):

$$X_{t}=\varphi X_{t-1}+e_{t} \qquad , e_t \sim WN(0,\sigma)$$

is a stationary process if $\varphi>1$ because it can be rewritten as

$$X_t=\sum_{k=0}^\infty {\varphi}^{-k}u_{t+k}$$

But now the variable $X_t$ depends on the future.


I wonder where this representation (which I remember having seen in several places) and the derivation originally come from.


I am confused about the derivation and wonder how it works. When I try to do the derivation myself, I fail.

I can rewrite the process $$X_{t+1}=\varphi X_{t}+e_{t+1}$$ as $$X_{t}= \varphi^{-1} X_{t+1} -\varphi^{-1} e_{t+1}$$ and, replacing $-\varphi^{-1} e_{t+1}$ by $u_{t}$, it becomes $$X_{t}= \varphi^{-1} X_{t+1} + u_{t}$$ so that the expression is 'like' another AR(1) process, but in reverse time, and now the coefficient is below 1, so it seemingly is stationary (*).

From the above it would indeed follow that $$X_t=\sum_{k=0}^\infty {\varphi}^{-k}u_{t+k}.$$ (*) But $u_t$ is not independent of $X_{t+1}$, because it is actually $e_{t+1}$ times a negative constant.
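For concreteness, here is a minimal numpy sketch of a truncated version of this sum (the values $\varphi = 2$, $\sigma = 1$, the truncation at $K = 60$ terms and the number of replications are arbitrary choices for illustration). It shows that the truncated series has a variance that does not depend on $t$ and satisfies the original equation up to an error of order $\varphi^{-K}$:

```python
import numpy as np

rng = np.random.default_rng(0)          # seed chosen arbitrarily
phi, T, K, R = 2.0, 50, 60, 20000       # coefficient, time points, truncation, replications
e = rng.normal(0.0, 1.0, size=(R, T + K + 1))

# Truncated version of X_t = sum_{k>=0} phi^{-k} u_{t+k} with u_t = -e_{t+1}/phi,
# i.e. X_t = -sum_{k=1}^{K} phi^{-k} e_{t+k}; the discarded tail is O(phi^{-K}).
X = np.stack([-sum(phi**-k * e[:, t + k] for k in range(1, K + 1))
              for t in range(T)], axis=1)

print(X.var(axis=0)[:5])  # ~ 1/(phi^2 - 1) for every t, i.e. constant in t
print(np.abs(X[:, 1:] - (phi * X[:, :-1] + e[:, 1:T])).max())  # ~ phi^{-K}: equation holds
```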

Sextus Empiricus
  • "...because it can be rewritten as..." is not really correct and suggests a possible misconception. Rather, given the (stochastic) sequence $(e_t)$, the stochastic difference equation...[AR(1) with root inside unit circle] has a non-causal stationary solution.... "But the $u_t$ is not independent from $X_{t+1}$"---actually it is, if you look at the solution itself: $X_{t+1}$ is a function of $e_{t+2}, e_{t+3}, \cdots$, and therefore independent of $e_{t+1}$ (assuming $(e_t)$ is i.i.d.). – Michael Oct 29 '20 at 01:58
  • @Michael $X_{t+1}$ is by definition a function of $e_{t+1}, e_{t}, e_{t-1}, e_{t-2}, \ldots$, and therefore dependent on $e_{t+1}$. So this reversed equation is not like the typical AR(1), where the increments are white noise independent of the current value. – Sextus Empiricus Oct 29 '20 at 08:00
  • It is maybe like reversing the first shot in a billiards game. In the first shot, the balls that are initially placed in a triangle get randomly dispersed; the order decreases. When we invert time, the order is not similarly increasing, because the random terms are 'predetermined'. This time-reversal only makes sense when you reset the white noise, which is not truly going backwards in time (at least not in the sense of playing a movie backwards). – Sextus Empiricus Oct 29 '20 at 08:16
  • "Xt+1 is by definition a function of et+1+et+et−1+et−2..., and therefore dependent on et+1"---which "definition"...? If you look at your actual solution, $X_t$ is a function of $e_{t+1}$, $e_{t+2}$, $e_{t+3}$.... Then substitute $t+1$ for $t$. Also, the infinite series you have for $X_t$ does converge in $L^2$, therefore in probability---i.e. there's no "remainder" term as an answer below suggests incorrectly. Therefore the infinite series is well-defined and gives a stationary solution to the AR(1) equation. – Michael Oct 29 '20 at 17:11
  • @Michael, I started with $$X_{t}=\varphi X_{t-1}+e_{t} \qquad , e_t \sim WN(0,\sigma)$$ So $X_{t+1}$ is by definition a function of $e_{t+1}$, and there will be a dependency between $X_{t+1}$ and $e_{t+1}$. I rewrote it in the form $$X_{t}= \varphi^{-1} X_{t+1} -\varphi^{-1} e_{t+1}$$ and here the two terms on the right, $X_{t+1}$ and $e_{t+1}$, will be correlated. So if I write the solution $$X_t=\sum_{k=0}^\infty {\varphi}^{-k}u_{t+k}$$ then these $u_{t+k}$ should be correlated? – Sextus Empiricus Oct 29 '20 at 17:11
  • No, there is a difference between the *equation* (the AR(1) equation in this case) with $X_t$ as unknown, and the *solution* which solves the equation. They are not the same. You're trying to deduce from the equation the correlation structure of $X_t$ and $e_t$ without considering the actual solution. I could easily write down a stochastic difference equation that has *no* solution and follow your reasoning to conclude that two random variables that do not exist are correlated. – Michael Oct 29 '20 at 17:13
  • This is similar to expressing $$B = A +\epsilon \qquad \text{where } \operatorname{cov}(A,\epsilon)=0$$ in terms of $$A = B - \epsilon = B + \epsilon^\prime$$ and then computing the variance; but (if I ignore the correlation between $B$ and $\epsilon^\prime$) it would lead to the contradiction of having both $\text{var}(B) = \text{var}(A) + \text{var}(\epsilon)$ and $\text{var}(A) = \text{var}(B) + \text{var}(\epsilon)$. – Sextus Empiricus Oct 29 '20 at 17:15
  • I get that if you reverse time you get a *different*, stationary, process, but I do not see how this solution is stationary (as in a movie going backwards, in which case billiard balls get perfectly back into place, which is different from turning the physical laws and causation backwards, in which case chaos will increase and you get a *different* movie with different error terms). So this makes me wonder what the *history* of this approach is: where and why did they do this the first time? – Sextus Empiricus Oct 29 '20 at 17:21
  • "This is similar to expressing B=A+e..."---no, it may appear that way but not the same. The difference here is there's an equation with $X_t$'s as unknown's one first needs to solve *then* consider correlation properties. Your solutions for $X_t$ is already a counter e.g. that shows that reasoning is problematic (hence the confusion, I guess). "I do not see how this solution is... stationary"---MA processes with white noise innovations are in general weakly stationary. – Michael Oct 29 '20 at 17:25
  • As for the historical question---I believe Mann and Wald (1943) already considered the non-causal AR(1) case, among other examples. Perhaps one can find references even earlier. – Michael Oct 29 '20 at 17:38
  • @Michael *"Your solutions for Xt is already a counter e.g. that shows that reasoning is problematic "* I get this solution from Brockwell and Davis' *Time series: theory and methods* (eq 3.1.14) as well as from the linked question. In B&D they argue *"which shows, by the same arguments as in the preceding paragraph, that"* But in that preceding paragraph they had the term $\lim_{t\to\infty} X_t/\varphi^t$ that Ben shows can not be eliminated for the process wackwards in time. The argument is that if the series is stationary then the term dissappears (but that is a cyclic argument). – Sextus Empiricus Oct 29 '20 at 17:49
  • The expression of the solution is not in question---it is completely standard. It's the confusion between the (AR(1)) equation and the solution, which apparently gave you the impression that they contradict each other, which then led to your question. Yes, to show that the infinite series of r.v.'s actually makes sense---i.e. it converges in $L^2$/in probability/etc.---one needs to show that the remainder term converges to zero. Otherwise you don't have a "solution" at all. As Brockwell and Davis show, it does converge to zero in $L^2$, and therefore in probability. That is also standard. – Michael Oct 29 '20 at 19:22
  • "...[the remainder term] can not be eliminated for the process wackwards in time"---that is clearly incorrect. "...but [Brockwell and Davis] is a cyclic argument..."---this suggests some fundamental misconceptions. Perhaps re-consider just what it means to solve a system of equations, e.g. how to solve an infinite system of deterministic equations. The AR(1) equation is an extension where the solution/unknown is a sequence random variables. – Michael Oct 29 '20 at 19:38

3 Answers

8

The question suggests some basic confusion between the equation and the solution.

The Equation

Let ${\varphi} > 1$. Consider the following (infinite) system of equations---one equation for each $t\in \mathbb{Z}$: $$ X_{t}=\varphi X_{t-1}+e_{t}, \mbox{ where } e_t \sim WN(0,\sigma), \;\; t \in \mathbb{Z}. \quad (*) $$

Definition Given $e_t \sim WN(0,\sigma)$, a sequence of random variables $\{ X_t \}_{t\in \mathbb{Z}}$ is said to be a solution of $(*)$ if, for each $t$, $$ X_{t}=\varphi X_{t-1}+e_{t}, $$ with probability 1.

The Solution

Define $$ X_t= - \sum_{k=1}^\infty {\varphi}^{-k}e_{t+k}, $$ for each $t$.

  1. $X_t$ is well-defined: The sequence of partial sums $$ X_{t,m} = - \sum_{k=1}^m {\varphi}^{-k}e_{t+k}, \;\; m \geq 1 $$ is a Cauchy sequence in the Hilbert space $L^2$, and therefore converges in $L^2$. $L^2$ convergence implies convergence in probability (although not necessarily almost surely). By definition, for each $t$, $X_t$ is the $L^2$/probability-limit of $(X_{t,m})$ as $m \rightarrow \infty$.

  2. $\{ X_t \}$ is, trivially, weakly stationary. (Any MA$(\infty)$ series with absolutely summable coefficients is weakly stationary.)

  3. $\{ X_t \}_{t\in \mathbb{Z}}$ is a solution of $(*)$, as can be verified directly by substitution into $(*)$.

This is a special case of how one would obtain a solution to an ARMA model: first guess/derive an MA$(\infty)$ expression, show that it is well-defined, then verify it's an actual solution.

$\;$

...But the $e_t$ is not independent of $X_{t}$...

This impression perhaps results from confusing the equation with the solution. Consider the actual solution: $$ \varphi X_{t-1} + e_t = \varphi \cdot \left( - \sum_{k=1}^\infty {\varphi}^{-k}e_{t+k-1} \right) + e_t. $$ The right-hand side is exactly $- \sum_{k=1}^\infty {\varphi}^{-k}e_{t+k}$, which is $X_t$ (we just verified Point #3 above). Notice how $e_t$ cancels and actually doesn't show up in $X_t$.
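As a numerical illustration of this cancellation (a minimal sketch, with assumed values $\varphi = 2$, $\sigma = 1$ and the infinite sum truncated at $K = 60$ terms), the sample correlation between $e_t$ and $X_t$ is essentially zero, while $X_{t-1}$ clearly does contain $e_t$:

```python
import numpy as np

rng = np.random.default_rng(1)   # seed chosen arbitrarily
phi, T, K = 2.0, 10000, 60       # assumed coefficient, sample size, truncation
e = rng.normal(size=T + K + 1)

# The solution X_t = -sum_{k=1}^{inf} phi^{-k} e_{t+k}, truncated at K terms:
# X_t is a function of e_{t+1}, e_{t+2}, ... only.
X = np.array([-sum(phi**-k * e[t + k] for k in range(1, K + 1))
              for t in range(T)])

e_t = e[1:T]  # e_t aligned with X_t for t = 1, ..., T-1
print(np.corrcoef(e_t, X[1:])[0, 1])   # ~ 0: e_t does not enter X_t
print(np.corrcoef(e_t, X[:-1])[0, 1])  # ~ -0.87: X_{t-1} does contain e_t
```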

$\;$

...where this...derivation originally comes from...

I believe Mann and Wald (1943) already considered the non-causal AR(1) case, among other examples. Perhaps one can find references even earlier. Certainly by the time of Box and Jenkins this was well known.

Further Comment

The non-causal solution is typically excluded from the stationary AR(1) model because:

  1. It is un-physical.

  2. Assume that $(e_t)$ is, say, Gaussian white noise. Then, for every non-causal solution, there exists a causal solution that is observationally equivalent, i.e. the two solutions would be equal as probability measures on $\mathbb{R}^{\mathbb{Z}}$. In other words, a stationary AR(1) model that includes both causal and non-causal cases is un-identified. Even if the non-causal solution is physical, one cannot distinguish it from its causal counterpart based on data. For example, if the innovation variance is $\sigma^2 = 1$, then the causal counterpart is the causal solution to the AR(1) equation with coefficient $\frac{1}{\varphi}$ and $\sigma^2 =\frac{1}{\varphi^2}$ (see the sketch below).
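As a quick check of this observational equivalence (a sketch with $\varphi = 2$ and $\sigma^2 = 1$ as above; the autocovariance formulas are the standard ones for the causal and non-causal MA$(\infty)$ representations):

```python
import numpy as np

phi, sigma2 = 2.0, 1.0                  # non-causal solution: coefficient phi > 1, innovation variance 1
phi_c, sigma2_c = 1 / phi, 1 / phi**2   # causal counterpart given above

h = np.arange(6)  # lags 0, 1, ..., 5
# Standard MA(infinity) autocovariances:
# non-causal solution X_t = -sum_{k>=1} phi^{-k} e_{t+k}: gamma(h) = sigma2 * phi^(-h) / (phi^2 - 1)
gamma_noncausal = sigma2 * phi**(-h) / (phi**2 - 1)
# causal AR(1) with |phi_c| < 1: gamma(h) = sigma2_c * phi_c^h / (1 - phi_c^2)
gamma_causal = sigma2_c * phi_c**h / (1 - phi_c**2)

print(np.allclose(gamma_noncausal, gamma_causal))  # True: identical second moments
```

With Gaussian innovations, equal autocovariance functions imply equal distributions, so the two parameterizations cannot be told apart from data.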

Michael
  • A detailed answer following a fascinating discussion in the comments. I love it! – Richard Hardy Oct 29 '20 at 20:21
  • I like this answer for its clear explanation and especially the reference to Mann and Wald. – Sextus Empiricus Oct 29 '20 at 20:25
  • You are saying that the non-causal solution is excluded because it is unphysical. I believe that this is how I look at the equation, and what is maybe the source of the confusion that leads to my impression (if it is actually confusion). I can see that your solution works by plugging it into the equation, but I regard it as an unreal solution because I see the equation as some sort of physical process for which the causality cannot be reversed in time... – Sextus Empiricus Oct 29 '20 at 20:33
  • ... If I compare the variance of $X_t$ as a function of the variance of $X_{t+1}$, then I believe the solution cannot be used to determine the variance. In my eyes the relationship of variances $$Var(X_t) = \varphi^{-2}Var(X_{t+1}) + \varphi^{-2} Var(e_{t+1}),$$ which would follow from that solution, is not right, or contradicts $$Var(X_{t+1}) = \varphi^2 Var(X_{t}) + Var(e_{t+1}).$$ It makes a difference whether we see $X_t$ as a function of $X_{t+1} + \text{noise}$ or $X_{t+1}$ as a function of $X_t + \text{noise}$. But this difference in direction is not clear in the equation. – Sextus Empiricus Oct 29 '20 at 20:43
  • A simpler analogy. The equation $$A-B = \epsilon,$$ with $\epsilon$ standard normal and $(A,B)$ bivariate normal with some covariance $\phi$, has indeed *two* solutions (due to the symmetry). But in the context of a causal model only one of them is 'accepted' or physical. E.g. when we speak of '$A$ is equal to $B$ with noise *added* to it', then only one of the two solutions makes sense. – Sextus Empiricus Oct 29 '20 at 20:53
  • This is what disturbs me when thinking of the non-causal solution. (It is a solution of the equation, but is it also a solution of the model? If we specify the equation/model in more detail, such that $X_t$ is independent of $e_s$ for $s>t$, then the solution is not valid anymore.)
  • While this is a lovely answer (+1), at the moment, Eqn $(*)$ does not hold (so point 3 is incorrect). Try substitution and you will see that you get $2e_t$ as a remainder. You need to amend it to put a minus sign on the summation. – Ben Oct 29 '20 at 22:13
  • @SextusEmpiricus "A simpler analogy. The equation A-B=e..."---I don't know if this analogy works here. Take $e$ to be deterministic, say $e = 2$. Then the equation $A-B=2$ has infinitely many solutions $(A,B)$. If $e$ is, say, some $N(0,1)$ r.v., then the equation clearly also has infinitely many solutions. It is "under-identified". The AR(1) equation has (much) more structure. – Michael Oct 29 '20 at 23:20
  • @Michael, I mean that the symbols refer to random variables. So maybe I should have written $X-Y \sim N(0,\sigma)$. Then it has a certain symmetry, with multiple solutions. Or if we write $X = Y + \epsilon$, where $\epsilon\sim N(0,\sigma)$, then this has the same symmetry, but often it is interpreted in a single direction, as '$X$ is the random variable obtained by adding the random variables $Y$ and $\epsilon$'. (And often we see the added $\epsilon$ as independent, but for the AR(1) equation $X_t - \varphi X_{t-1} = e_t$ this independence is not made explicit, so there are more solutions.) – Sextus Empiricus Oct 30 '20 at 00:23
2

Re-arranging your first equation and increasing the index by one gives the "reverse" AR(1) form:

$$X_{t} = \frac{1}{\varphi} X_{t+1} - \frac{e_{t+1}}{\varphi}.$$

Suppose you now define the observable values using the filter:

$$X_t = - \sum_{k=1}^\infty \frac{e_{t+k}}{\varphi^k}.$$

You can confirm by substitution that both the original AR(1) form and the reversed form hold in this case. As pointed out in the excellent answer by Michael, this means that the model is not identified unless we exclude this solution by definition.
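Here is a minimal numerical confirmation of that substitution (the coefficient $\varphi = 2$ and the truncation of the filter at $K = 60$ terms are assumed purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)   # seed chosen arbitrarily
phi, T, K = 2.0, 100, 60         # assumed coefficient, sample length, filter truncation
e = rng.normal(size=T + K + 2)

# The filter X_t = -sum_{k>=1} e_{t+k} / phi^k, truncated at K terms
X = np.array([-sum(e[t + k] / phi**k for k in range(1, K + 1))
              for t in range(T)])

t = np.arange(1, T - 1)
forward = X[t] - (phi * X[t - 1] + e[t])            # original AR(1) form
reverse = X[t] - (X[t + 1] / phi - e[t + 1] / phi)  # reversed form
print(np.abs(forward).max(), np.abs(reverse).max())  # both ~ 0 (up to phi^{-K})
```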

Ben
  • I like your brief answer. I can see that this 'other' solution occurs because the equations do not necessarily have a causal interpretation (and my confusion with it is that I did give the equations a causal interpretation, as a formula for an iterative scheme). It is sort of like computing the size of an object with a particular area as the solution of $l^2\propto A$, which allows negative size. – Sextus Empiricus Oct 29 '20 at 23:04
  • But I still wonder about my main question where this alternative solution originally comes from. – Sextus Empiricus Oct 29 '20 at 23:06
  • Yes, fair enough. I'm sorry I was unable to answer that part --- my knowledge of the original time-series books is thin. – Ben Oct 29 '20 at 23:43
1

... the AR(1) process (with $e_t$ white noise):

$$X_{t}=\varphi X_{t-1}+e_{t} \qquad , e_t \sim WN(0,\sigma)$$

is a stationary process if $\varphi>1$ because ...

It seems to me not possible, as shown here: https://en.wikipedia.org/wiki/Autoregressive_model#Example:_An_AR(1)_process

For wide-sense stationarity, $-1 < \varphi < 1$ must hold.

Moreover, maybe I am missing something here, but it seems to me that not only can the process above not be stationary, but it is entirely impossible and/or badly defined. This is because if we have an autoregressive process, we are not in a situation like $Y=\theta Z+u$, where $Z$ and $u$ can be two unrestricted random variables and $\theta$ an unrestricted parameter.

In a regression, residuals and parameters are not free terms; given the dependent and independent variable(s), they are determined too.

So, in the AR(1) case it is possible to show that $-1 \leq \varphi \leq 1$ must hold, as for an autocorrelation.

Moreover, if we assume that the $e_t$ (residuals) are a white noise process, we place a restriction on the $X_t$ process too. If we estimate an AR(1) on the data and the $e_t$ turn out to be autocorrelated, the assumption/restriction does not hold, and the AR(1) is not a good specification.

markowitz