3

Suppose I have the following stationary $AR(1)$ process:

$$ y_{t}=\alpha_{0}+\alpha_{1}y_{t-1} + u_{t} $$

where $u_{t} \sim \mathcal{N}(0,\sigma^{2})$, with $\sigma^{2}$ known. Suppose I have an initial condition $y_{1}$ and a terminal condition $y_{T}$, and I would like to simulate the process for the periods in between, i.e. $t = 2,\dots,T-1$. Substituting recursively, I can write the final term $y_{T}$ as

$$ y_{T} = \alpha_{0} \left(\frac{1-\alpha_{1}^{T-1}}{1-\alpha_{1}} \right) + \alpha_{1}^{T-1}y_{1} + \sum_{t=0}^{T-2} \alpha_{1}^{t}u_{T-t} $$

Since $y_{T}$ and $y_{1}$ are known, the weighted sum of the innovations must therefore equal

$$ \sum_{t=0}^{T-2} \alpha_{1}^{t}u_{T-t} = y_{T} - \alpha_{0} \left(\frac{1-\alpha_{1}^{T-1}}{1-\alpha_{1}} \right) - \alpha_{1}^{T-1}y_{1} $$

Now, my question is: what is the correct way to draw the sequence of $u_{t}$, knowing that they are i.i.d. and unconditionally distributed as $\mathcal{N}(0,\sigma^{2})$?

The way I am trying to approach this problem follows the comment of @soakley on the post "Sample random variables conditional on their sum". However, I am not sure about the proper iterative procedure.

Clearly, I could treat the $\alpha_{1}^{t}$ as weights, but I am still confused. I proceed as follows:

0) Define $z_{2}=y_{T} - \alpha_{0} \left(\frac{1-\alpha_{1}^{T-1}}{1-\alpha_{1}} \right) - \alpha_{1}^{T-1}y_{1}$

1) Draw $u_{t}\sim \mathcal{N}\left( \frac{z_{t}}{T-t+1},\frac{T-t}{T-t+1} \sigma^{2} \right)$ for $t = 2,\dots,T-1$

2) At each step, update $z_{t}$ as follows

$$ z_{t+1} = z_{t} - \alpha_{1}^{T-t} u_{t} $$

3) Let $u_{T} = z_{T}$

I would really like to know if this procedure is correct.
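For concreteness, here is a literal R transcription of steps 0)-3) (the parameter values are just illustrative). By construction the rebuilt path ends exactly at $y_{T}$; whether the draws have the correct conditional distribution is precisely what I am asking.

#Literal transcription of steps 0)-3) above (illustrative parameter values)
a0 <- 0.5; a1 <- 0.8; sigma <- 1
TT <- 20                   #number of periods, t = 1, ..., TT
y1 <- 1; yT <- 2           #initial and terminal conditions

#Step 0: the required weighted sum of the innovations
z <- yT - a0 * (1 - a1^(TT - 1)) / (1 - a1) - a1^(TT - 1) * y1

u <- numeric(TT)           #u[2], ..., u[TT] will be filled in
for (t in 2:(TT - 1)) {
  #Step 1: draw u_t from the proposed conditional normal
  u[t] <- rnorm(1, mean = z / (TT - t + 1), sd = sigma * sqrt((TT - t) / (TT - t + 1)))
  #Step 2: update z
  z <- z - a1^(TT - t) * u[t]
}
#Step 3: the final innovation absorbs whatever remains
u[TT] <- z

#Rebuild the path; y[TT] equals yT by construction
y <- numeric(TT); y[1] <- y1
for (t in 2:TT) y[t] <- a0 + a1 * y[t - 1] + u[t]
y[TT]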

econ_ugrad
  • Hi: are you missing a lagged $y_t$ in the first equation? Because, as it stands, that's not an AR(1) process. If so, it's an interesting problem but I wanted to make sure of that first before I thought about it. Thanks. – mlofton Apr 04 '20 at 18:13
  • also, are $\alpha_0$ and $\alpha_1$ known parameters ? – mlofton Apr 04 '20 at 18:17
  • @mlofton yes, I am really sorry. It was a typo. I am assuming I know the parameters $\alpha_{0}$ and $\alpha_{1}$ as well. – econ_ugrad Apr 06 '20 at 12:22
  • This reminds me of an interpolating technique called 'Kriging', which is also often called a 'Gaussian process'. Might be interesting to look that up. – LBogaardt Apr 08 '20 at 21:38

3 Answers

2

Since the overarching goal here is to generate the $\text{AR}(1)$ series values, I will show you how to generate these directly, rather than via the error terms. (The method for generating the error terms is very similar.) The way to do this is to write out the multivariate normal distribution for the process and then use the rules for the conditional distribution of a multivariate normal. I will use the more standard notation:

$$Y_t = \mu + \phi (Y_{t-1} - \mu) + u_t \quad \quad \quad u_t \sim \text{IID N}(0, \sigma^2).$$

Under stationarity, any vector of values from this model has a multivariate normal distribution, with variance matrix determined by the auto-covariance function of the process. Thus, when you want conditional distributions, you can use standard results for conditioning in the multivariate normal distribution.


Conditional distribution given endpoints: The auto-covariance function for the $\text{AR}(1)$ process above is:

$$\gamma(t) = \frac{\sigma^2 \phi^{|t|}}{1-\phi^2} \quad \quad \quad \text{for } t \in \mathbb{Z}.$$

Thus, the (unconditional) joint distribution of the vector $\mathbf{Y} = (Y_0,...,Y_T)$ is:

$$\mathbf{Y} \sim \text{N} \Bigg( \mu \mathbf{1}, \sigma^2 \mathbf{\Sigma}(\phi) \Bigg) \quad \quad \quad \mathbf{\Sigma}(\phi) \equiv \frac{1}{1-\phi^2} \begin{bmatrix} 1 & \phi & \phi^2 & \cdots & \phi^T \\ \phi & 1 & \phi & \cdots & \phi^{T-1} \\ \phi^2 & \phi & 1 & \cdots & \phi^{T-2} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \phi^T & \phi^{T-1} & \phi^{T-2} & \cdots & 1 \\ \end{bmatrix}.$$

To obtain the conditional distribution of interest, we decompose the random vector $\mathbf{Y}$ as:

$$\mathbf{Y} = \begin{bmatrix} \mathbf{Y}_\text{INT} \\ \mathbf{Y}_\text{END} \end{bmatrix},$$

where $\mathbf{Y}_\text{INT} = (Y_1,...,Y_{T-1})$ and $\mathbf{Y}_\text{END} = (Y_0,Y_T)$. We likewise decompose the mean vector and variance matrix as:

$$\mu \cdot \mathbf{1} = \begin{bmatrix} \mu \cdot \mathbf{1} \\ \mu \cdot \mathbf{1} \end{bmatrix} \quad \quad \quad \mathbf{\Sigma}(\phi) = \begin{bmatrix} \mathbf{\Sigma}_\text{INT}(\phi) & \mathbf{\Sigma}_\text{CROSS}(\phi) \\ \mathbf{\Sigma}_\text{CROSS}(\phi)^\text{T} & \mathbf{\Sigma}_\text{END}(\phi) \end{bmatrix},$$

where the mean vectors are of appropriate lengths and the variance parts are:

$$\begin{aligned} \mathbf{\Sigma}_\text{INT}(\phi) &\equiv \frac{1}{1-\phi^2} \begin{bmatrix} 1 & \phi & \phi^2 & \cdots & \phi^{T-2} \\ \phi & 1 & \phi & \cdots & \phi^{T-3} \\ \phi^2 & \phi & 1 & \cdots & \phi^{T-4} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \phi^{T-2} & \phi^{T-3} & \phi^{T-4} & \cdots & 1 \\ \end{bmatrix}, \\[30pt] \mathbf{\Sigma}_\text{CROSS}(\phi) &\equiv \frac{1}{1-\phi^2} \begin{bmatrix} \phi & \phi^{T-1} \\ \phi^2 & \phi^{T-2} \\ \phi^3 & \phi^{T-3} \\ \vdots & \vdots \\ \phi^{T-1} & \phi \\ \end{bmatrix}, \\[30pt] \mathbf{\Sigma}_\text{END}(\phi) &\equiv \frac{1}{1-\phi^2} \begin{bmatrix} 1 & \phi^{T} \\ \phi^{T} & 1 \\ \end{bmatrix}. \\[6pt] \end{aligned}$$

Now, using the conditional density rule for the multivariate normal distribution, we have:

$$p(\mathbf{y}_\text{INT}| \mathbf{y}_\text{END}) = \text{N}(\mathbf{y}_\text{INT}| \boldsymbol{\mu}_*, \boldsymbol{\Sigma}_*),$$

where:

$$\begin{equation} \begin{aligned} \boldsymbol{\mu}_* &\equiv \mu \cdot \mathbf{1} + \boldsymbol{\Sigma}_\text{CROSS}(\phi) \boldsymbol{\Sigma}_\text{END}(\phi)^{-1} (\boldsymbol{y}_\text{END} - \mu \cdot \mathbf{1}), \\[8pt] \boldsymbol{\Sigma}_* &\equiv \boldsymbol{\Sigma}_\text{INT}(\phi) - \boldsymbol{\Sigma}_\text{CROSS}(\phi) \boldsymbol{\Sigma}_\text{END}(\phi)^{-1} \boldsymbol{\Sigma}_\text{CROSS}(\phi)^\text{T}. \\[6pt] \end{aligned} \end{equation}$$

This gives you the conditional distribution of the observable values in the process. If you particularly want the error terms, you can extract them from the observed values using the recursive equation for the process.
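For concreteness, here is a minimal R sketch of these formulas (the parameter values $\mu = 0$, $\phi = 0.4$, $\sigma = 1$, $T = 30$ and the endpoint values are illustrative choices, not part of the derivation). It builds the stationary covariance matrix, partitions it as above, and draws the interior values from $\text{N}(\boldsymbol{\mu}_*, \boldsymbol{\Sigma}_*)$ using MASS::mvrnorm.

#Minimal sketch of the endpoint-conditioning formulas above (illustrative parameters)
library(MASS)              #for mvrnorm

mu    <- 0                 #process mean
phi   <- 0.4               #AR(1) coefficient
sigma <- 1                 #innovation standard deviation
TT    <- 30                #time index runs t = 0, ..., TT
y0    <- -3                #conditioning value for Y_0
yT    <- 3                 #conditioning value for Y_T

#Full stationary covariance matrix: sigma^2 * phi^|i-j| / (1 - phi^2)
idx   <- 0:TT
Sigma <- sigma^2 * phi^abs(outer(idx, idx, "-")) / (1 - phi^2)

int <- 2:TT                #matrix positions of Y_1, ..., Y_{T-1}
end <- c(1, TT + 1)        #matrix positions of Y_0 and Y_T

S_int   <- Sigma[int, int]
S_cross <- Sigma[int, end]
S_end   <- Sigma[end, end]

#Conditional mean and variance of the interior values given the endpoints
mu_star  <- mu + S_cross %*% solve(S_end, c(y0, yT) - mu)
Sig_star <- S_int - S_cross %*% solve(S_end, t(S_cross))

#One draw of Y_1, ..., Y_{T-1} and the full bridge path
y_int <- MASS::mvrnorm(1, mu = as.vector(mu_star), Sigma = Sig_star)
y     <- c(y0, y_int, yT)
plot(idx, y, type = "l")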

Ben
  • Ben: This is interesting but if he wants the terminal condition to be $Y_T$, how can there be a distribution ? Doesn't one need a deterministic condition to be sure that one ends up at $Y_T$ ? Thanks. – mlofton Apr 10 '20 at 21:55
  • This method gives the conditional distribution of $\mathbf{y}_\text{INT} = (y_1,...,y_{T-1})$ conditional on $y_0$ and $y_T$. So the latter values can be any conditioning values entered by the user. The endpoints are fixed, but the interior points between them are still random. – Ben Apr 10 '20 at 23:33
  • It's definitely possible that I mis-interpreted the question. I took it that one has the values of $y_{0}$ and $y_{T}$ and the question is: what do the noise terms between $t=1$ and $t=(T-1)$ need to be in order to ensure that the path of $y$ ends up at the value $y_{T}$. My answer was generate the noise terms from $t=2$ to $t=T-1$ as $Normal(0, \sigma^2)$. Then, set $\epsilon_{T}$ to the function in the answer. This will ensure that $y$ reaches $y_T$ at time $t=T$. But what you did is pretty different and interesting. No idea whose interpretation is correct. Thanks. – mlofton Apr 10 '20 at 23:57
  • Your interpretation is correct, but unfortunately your method is wrong. Once you condition on $y_0$ and $y_T$, the error terms between these values are no longer independent, so it is not legitimate to generate them independently. The method I use bypasses the errors and generates $y_1,...,y_{T-1}$ directly, but you could certainly extract the error terms from the observed values if that is preferred. – Ben Apr 11 '20 at 00:00
  • But I don't see how you can ensure consistency in your model. If $y_{T} -\mu = \phi \times (y_{T-1} - \mu) + \epsilon_{T}$ and $y_{T}$ is known and fixed, then $\epsilon_{T}$ needs to be fixed and not an RV ? Essentially, it needs to be a function of your $Y_{INT}$. – mlofton Apr 11 '20 at 00:07
  • Oh, note that I am not conditioning on $Y_{T}$. I'm describing what $\epsilon_{T}$ needs to be in order to be sure that one ends up at the value $Y_{T}$. A totally different question from yours. Maybe the OP can tell us whose interpretation was correct. Thanks for interesting derivation. – mlofton Apr 11 '20 at 00:10
  • But since $\mathbf{y}_\text{INT}$ is random, the error term $u_t$ is also random (since it depends on the former). – Ben Apr 11 '20 at 00:29
  • Hi Ben: Not to beat a dead-horse but if $Y_T$ is supposed to be some known value and also be consistent with your equation for the AR(1), how can that be guaranteed if $Y_{INT}$ is random. There has to be something that's deterministic in order to A) obtain the value of $Y_T$ at time $t = T$ and B) have $Y_T$ adhere to the AR(1) relation. Essentially, if $\epsilon_T$ is random, then one can't guarantee A) and B) – mlofton Apr 11 '20 at 02:18
  • That is not correct --- all of the error terms are still (marginally) random, but they are now constrained by two linear equations. The error term $u_t$ is only determined if we condition on both $y_T$ and $y_{T-1}$. – Ben Apr 11 '20 at 05:05
  • (An analogy would be if you said you have random variables $x_1,...,x_{10}$ with known sample mean $\bar{x} = 6$. Conditional on that information, each of the values is still random, but once you specify any nine of them, the last one becomes determined.) – Ben Apr 11 '20 at 05:06
  • So I think you're saying that $y_T$ will satisfy the AR(1) relationship at the top of your explanation when using your approach. If so, then I believe you but I don't see it. That's okay (there are many things I don't understand and live with :)) and thanks for explaining. – mlofton Apr 11 '20 at 22:05
  • I'm not saying you're wrong, but note that, when you do your derivation, your only assumption is multivariate normality with the derived covariance matrix. There is nothing in there that restricts the sequence of $Y_t$. By this I mean that the conditional mean and variance relationships are preserved, but there's nothing in there saying that the AR(1) process itself has to be ensured at each time $t$. This is why I don't see why the AR(1) relation itself will end up holding at each $t$. But that's okay. We don't have to discuss this anymore and I appreciate your patience. – mlofton Apr 11 '20 at 22:18
  • No problem --- I think your misgivings here come from that last point. In fact, the specification of the multivariate normal joint density does satisfy the underlying recursive equation for AR(1) (i.e., the error terms induced by this distribution can be shown to be IID normal). The distribution I have specified is the stationary distribution for the Gaussian AR(1) process. – Ben Apr 12 '20 at 00:18
  • I wrote an R program to simulate your solution and that really helped me to see what's going on. They can satisfy the recursive relationships because of the $\epsilon$ terms, which is really where all the auto-correlation stems from in the first place. I will post the code in an "answer" in case anyone wants it. It's useful for seeing what actually happens when Ben's algorithm is used. Thanks Ben for your help and patience. – mlofton Apr 12 '20 at 03:30
  • It's easier if I attached the R code but I can't find the way to include an attachment. thanks for any pointers. – mlofton Apr 12 '20 at 03:50
  • @mlofton: Note on your answer with the R code (presently hidden). You can display R code properly by adding four spaces prior to each code line. You should also add a ```#``` at the start of each comment line so that the R processor does not read it. I have taken the liberty of editing your answer to make these changes. – Ben Dec 23 '20 at 23:32
0

Here is my revised and (hopefully) final attempt.

Suppose we have the AR(1) process $y_{t}= \alpha \, y_{t-1} + \epsilon_t$, we know $y_1$ and $y_{T}$, and we want to figure out the innovations, starting at $t = 2$, that bring us to $y_{T}$.

So, suppose there is an innovation $\epsilon_{2}$ at time $t = 2$, similarly a new $\epsilon_{3}$ at $t = 3$, and so on up to and including $t = T-1$.

What does this imply for $y_{2}, y_{3}, \ldots, y_{T-1}$?

Well, then we have (don't ask what I was doing in my earlier attempts):

$y_{2} = \alpha \times y_{1} + \epsilon_{2} $

$y_{3} = \alpha^2 y_{1} + \alpha \times \epsilon_{2} + \epsilon_{3} $

$y_{4} = \alpha^3 y_{1} + \alpha^2 \times \epsilon_{2} + \alpha \epsilon_{3} + \epsilon_{4} $

So, at time $t={T-1}$, we have

(1) $y_{T-1} = \alpha^{T-2} y_{1} + \sum_{i=2}^{T-1} \epsilon_{i} \, \alpha^{T-1-i}$

But, since $y_{T} = \alpha \times y_{T-1} + \epsilon_{T}$, this leads to

(2) $ y_{T} = \alpha^{T-1} y_{1} + \sum_{i=2}^{T-1} \epsilon_{i} \, \alpha^{T-i} + \epsilon_{T}$

Notice that the last term is just the $i = T$ term taken out of the summation. This makes it very easy to solve for $\epsilon_{T}$ algebraically, which leads to:

(3) $\epsilon_{T} = y_{T} - \alpha^{T-1} y_{1} - \sum_{i=2}^{T-1} \epsilon_{i} \, \alpha^{T-i}$.

Everything on the RHS of (3) is known, which means that $\epsilon_{T}$ is a function of variables whose values are all known at time $t = T$.

And my sincere apologies for all the noise and mistakes. Your expressions were totally correct, so maybe the procedure that you describe and the one I describe here are equivalent? I would think that the values of $\epsilon_{t}$ from $t=2$ to $t=T-1$ can be anything as long as $\epsilon_{T}$ is the deterministic function of them given by (3). But I've lost all confidence in my ability to solve this problem correctly, so correct me if your thinking is different. :)
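For what it's worth, here is a short R transcription of this construction (illustrative parameter values): draw $\epsilon_{2},\dots,\epsilon_{T-1}$ freely, set $\epsilon_{T}$ by (3), and the rebuilt path reaches $y_{T}$ by construction. Whether the interior innovations may legitimately be drawn independently is the point debated in the comments under the other answer.

#Sketch of the construction above (illustrative parameters): draw the interior
#innovations freely, then set eps_T via (3) so that the path hits y_T.
alpha <- 0.8; sigma <- 1
TT    <- 20
y1    <- 1; yT <- 2

eps <- numeric(TT)
eps[2:(TT - 1)] <- rnorm(TT - 2, mean = 0, sd = sigma)

#Equation (3): eps_T = y_T - alpha^(T-1) * y_1 - sum_{i=2}^{T-1} alpha^(T-i) * eps_i
i <- 2:(TT - 1)
eps[TT] <- yT - alpha^(TT - 1) * y1 - sum(alpha^(TT - i) * eps[i])

#Rebuild the path and confirm it ends at yT
y <- numeric(TT); y[1] <- y1
for (t in 2:TT) y[t] <- alpha * y[t - 1] + eps[t]
y[TT]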

mlofton
  • @Maney: I made another mistake during my second attempt so I'm gonna post a new answer and then delete the other ones. Man, mistakes can really be wrong :). My apologies. – mlofton Apr 09 '20 at 06:42
0

Use the rGARMA function in the ts.extend package

(Note: This answer augments my other theory-based answer, the method for which has now been programmed into the present package.)

You can generate random vectors from the marginal or conditional distributions of any stationary Gaussian ARMA model using the ts.extend package. This package generates random vectors directly from the multivariate normal distribution using the computed autocorrelation matrix for the random vector, so it gives random vectors from the exact distribution and does not require "burn-in" iterations. It also allows you to impose conditioning values, in which case the package generates from the resulting conditional distribution.

Here is an example for an $\text{AR}(1)$ model where you condition on values at the terminal points at either end of the process. (Here I will use a model with parameters $\alpha_0 = 0$, $\alpha_1 = 0.4$ and $\sigma=1$. I will generate twelve random vectors from this model, each with $m=30$ points and terminal conditioning points $y_1=-3$ and $y_m=3$.)

#Load the package
library(ts.extend)

#Set parameters
AR <- 0.4
m  <- 30
y1 <- -3
ym <- 3

#Set the conditioning vector
CONDVALS    <- rep(NA, m)
CONDVALS[1] <- y1
CONDVALS[m] <- ym

#Generate n = 12 random vectors from this model
set.seed(1)
SERIES <- rGARMA(n = 12, m = m, ar = AR, condvals = CONDVALS)

#Plot the series using ggplot2 graphics
library(ggplot2)
plot(SERIES)

As you can see from the image below, each of the plots shows a time-series vector that has the stipulated starting and ending points. The interior points are generated from the conditional distribution for the process. As you can see in the code, the syntax for this is quite simple --- all you need to do is to create the vector CONDVALS with the stipulated conditioning values and add this as an argument for the function.

[Figure: twelve simulated series, each starting at $y_1 = -3$ and ending at $y_{30} = 3$, with the interior points drawn from the conditional distribution.]

Ben