For this type of estimator, it is quite difficult to get a useful closed-form expression for its distribution, and hence for its expected value. The behaviour of the estimator can be related back to a quadratic form in a normal random vector, but this gives quite a complicated expression.
Distribution of the time series vector: Suppose we use the centered $\text{AR}(1)$ model $y_t = \phi y_{t-1} + \varepsilon_t$ with homoscedastic normal error terms $\varepsilon_t \sim \text{IID N}(0, \sigma^2)$ and $|\phi|<1$ for stationarity. We can write the distribution of the observable time series vector $\mathbf{y} = (y_1,...,y_n)$ as:
$$\mathbf{y} \sim \text{N}( \mathbf{0},\mathbf{\Sigma}(\phi)) \quad \quad \quad
\mathbf{\Sigma}(\phi) = \frac{\sigma^2}{1-\phi^2}
\begin{bmatrix}
1 & \phi & \phi^2 & \cdots & \phi^{n-2} & \phi^{n-1} \\
\phi & 1 & \phi & \cdots & \phi^{n-3} & \phi^{n-2} \\
\phi^2 & \phi & 1 & \cdots & \phi^{n-4} & \phi^{n-3} \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
\phi^{n-2} & \phi^{n-3} & \phi^{n-4} & \cdots & 1 & \phi \\
\phi^{n-1} & \phi^{n-2} & \phi^{n-3} & \cdots & \phi & 1 \\
\end{bmatrix}.$$
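To make this concrete, here is a minimal numerical sketch (not part of the derivation) that constructs $\mathbf{\Sigma}(\phi)$ as a scaled Toeplitz matrix; the values of $n$, $\phi$ and $\sigma$ below are arbitrary illustrative choices.

```python
# Minimal sketch (illustrative values only): build Sigma(phi) for the centered
# AR(1) model, where entry (i, j) equals sigma^2 * phi^|i-j| / (1 - phi^2).
import numpy as np
from scipy.linalg import toeplitz

n, phi, sigma = 8, 0.6, 1.0          # assumed example values, not from the question
Sigma = (sigma**2 / (1 - phi**2)) * toeplitz(phi**np.arange(n))
```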
Taking the principal square root of the variance matrix gives the alternative form:
$$\mathbf{y} = \mathbf{\Delta}(\phi) \mathbf{z} \quad \quad \quad
\mathbf{z} \sim \text{N}( \mathbf{0},\boldsymbol{I}) \quad \quad \quad
\mathbf{\Delta}(\phi) = \mathbf{\Sigma}(\phi)^{1/2}.$$
(The matrix $\mathbf{\Delta}(\phi)$ can easily be obtained using the eigendecomposition of the variance matrix. The variance matrix is a symmetric positive-definite Toeplitz matrix, with eigenvalues and eigenvectors that are functions of $\phi$; writing its spectral decomposition as $\mathbf{\Sigma}(\phi) = \mathbf{V} \, \text{diag}(\boldsymbol{\lambda}) \, \mathbf{V}^\text{T}$ gives the principal square root $\mathbf{\Delta}(\phi) = \mathbf{V} \, \text{diag}(\sqrt{\boldsymbol{\lambda}}) \, \mathbf{V}^\text{T}$.)
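Continuing the sketch above, the principal square root can be computed from the spectral decomposition and used to simulate a draw of $\mathbf{y}$:

```python
# Spectral decomposition Sigma = V diag(lam) V^T, principal square root
# Delta = V diag(sqrt(lam)) V^T, and one simulated series y = Delta @ z.
lam, V = np.linalg.eigh(Sigma)
Delta = V @ np.diag(np.sqrt(lam)) @ V.T
assert np.allclose(Delta @ Delta, Sigma)   # check Delta is the square root

rng = np.random.default_rng(0)
z = rng.standard_normal(n)
y = Delta @ z                              # a single draw from N(0, Sigma(phi))
```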
Distribution of the OLS estimator: For this model (clarifying your use of subscripts in the sums) the OLS estimator is:
$$\hat{\theta} = \frac{\sum_{t=2}^n y_t y_{t-1}}{\sum_{t=2}^n y_t^2}.$$
(Note that the OLS estimator for this model is not the maximum-likelihood estimator (MLE), since it effectively ignores a logarithmic term involving the variance matrix that appears in the log-likelihood function.) To find the distribution of the OLS estimator, define the $n \times n$ matrix:
$$\mathbf{M}(r) \equiv \begin{bmatrix}
0 & 0 & 0 & \cdots & 0 & 0 \\
-1 & r & 0 & \cdots & 0 & 0 \\
0 & -1 & r & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & r & 0 \\
0 & 0 & 0 & \cdots & -1 & r \\
\end{bmatrix}.$$
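Here is a small continuation of the sketch that builds $\mathbf{M}(r)$ and checks the quadratic-form identity used in the next step; `r = 0.5` is just an arbitrary test value.

```python
# Build M(r): zeros in the first row, then r on the diagonal and -1 on the
# first subdiagonal for rows 2..n, so that
#   y^T M(r) y = r * sum_{t=2}^n y_t^2 - sum_{t=2}^n y_t y_{t-1}.
def M(r, n):
    out = np.zeros((n, n))
    rows = np.arange(1, n)            # rows 2..n (0-based indices 1..n-1)
    out[rows, rows] = r
    out[rows, rows - 1] = -1.0
    return out

r = 0.5                               # arbitrary test value
lhs = y @ M(r, n) @ y
rhs = r * np.sum(y[1:]**2) - np.sum(y[1:] * y[:-1])
assert np.isclose(lhs, rhs)
```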
Expanding the quadratic form gives $\mathbf{y}^\text{T} \mathbf{M}(r) \mathbf{y} = r \sum_{t=2}^n y_t^2 - \sum_{t=2}^n y_t y_{t-1}$, so (since the denominator of the estimator is positive) the event $\hat{\theta} \leqslant r$ is equivalent to the event $\mathbf{y}^\text{T} \mathbf{M}(r) \mathbf{y} \geqslant 0$. We therefore have:
$$\begin{aligned}
F_{\hat{\theta}}(r)
\equiv \mathbb{P}(\hat{\theta} \leqslant r)
&= \mathbb{P}(\mathbf{y}^\text{T} \mathbf{M}(r) \mathbf{y} \geqslant 0) \\[6pt]
&= \mathbb{P}((\mathbf{\Delta}(\phi) \mathbf{z})^\text{T} \mathbf{M}(r) (\mathbf{\Delta}(\phi) \mathbf{z}) \geqslant 0) \\[6pt]
&= \mathbb{P}(\mathbf{z}^\text{T} \mathbf{\Delta}(\phi) \mathbf{M}(r) \mathbf{\Delta}(\phi) \mathbf{z} \geqslant 0) \\[6pt]
&= \mathbb{P}(\mathbf{z}^\text{T} \mathbf{H}(\phi, r) \mathbf{z} \geqslant 0),
\end{aligned}$$
where $\mathbf{H}(\phi, r) \equiv \mathbf{\Delta}(\phi) \mathbf{M}(r) \mathbf{\Delta}(\phi)$ (the third step uses the symmetry of $\mathbf{\Delta}(\phi)$, so that $\mathbf{\Delta}(\phi)^\text{T} = \mathbf{\Delta}(\phi)$).
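This event equivalence is easy to check by simulation. Continuing the sketch, the Monte Carlo estimate of $\mathbb{P}(\mathbf{z}^\text{T} \mathbf{H}(\phi, r) \mathbf{z} \geqslant 0)$ should match the empirical CDF of $\hat{\theta}$ at the same $r$ (the sample size below is an arbitrary choice):

```python
# Monte Carlo check: P(z^T H(phi, r) z >= 0) should match the empirical
# CDF of the OLS estimator at r, since the two events are equivalent.
H = Delta @ M(r, n) @ Delta
Z = rng.standard_normal((200_000, n))                 # rows are draws of z
quad = np.einsum('ij,jk,ik->i', Z, H, Z)              # z^T H z for each draw
cdf_via_quadform = np.mean(quad >= 0)

Y = Z @ Delta                                         # rows are draws of y (Delta is symmetric)
theta_hat = np.sum(Y[:, 1:] * Y[:, :-1], axis=1) / np.sum(Y[:, 1:]**2, axis=1)
cdf_direct = np.mean(theta_hat <= r)
print(cdf_via_quadform, cdf_direct)                   # these should agree closely
```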
Hence, the OLS estimator has the density function:
$$f_{\hat{\theta}}(r) = \frac{\partial P(\phi,r)}{\partial r}
\quad \quad \quad
P(\phi,r) \equiv \mathbb{P}(\mathbf{z}^\text{T} \mathbf{H}(\phi, r) \mathbf{z} \geqslant 0). $$
Thus, the expected value is $\mathbb{E}(\hat{\theta}) = \int r \, \frac{\partial P(\phi,r)}{\partial r} \, dr$.
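As a rough numerical illustration of this last formula (still under the arbitrary values used in the sketch), $P(\phi,r)$ can be approximated by Monte Carlo on a grid of $r$ values, differentiated by finite differences, and integrated over a truncated range; the result should only roughly agree with the direct simulation average of $\hat{\theta}$.

```python
# Approximate E(theta_hat) = integral of r * dP(phi, r)/dr, using the draws above.
# Note z^T H(phi, r) z = (Delta z)^T M(r) (Delta z) is linear in r, so P(phi, r)
# is cheap to evaluate over a grid of r values.
A = np.sum(Y[:, 1:]**2, axis=1)                       # coefficient of r
B = np.sum(Y[:, 1:] * Y[:, :-1], axis=1)              # constant part
grid = np.linspace(-1.5, 1.5, 301)                    # truncated integration range (assumption)
P = np.array([np.mean(r_ * A - B >= 0) for r_ in grid])

density = np.gradient(P, grid)                        # finite-difference dP/dr
dr = grid[1] - grid[0]
print(np.sum(grid * density) * dr)                    # approximate E(theta_hat)
print(np.mean(theta_hat))                             # direct simulation average
```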