4

In the Breusch-Godfrey test we regress the residuals $e_t$ of the original model on the original regressors and on the lagged residuals:

$$ e_t = X_t \gamma + \rho_1 e_{t-1} + \dots + \rho_p e_{t-p} + u_t. $$

If we reject the null hypothesis of no serial autocorrelation of the errors, this suggests that the errors follow an autoregressive process of order (up to) $p$.
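
For reference, here is a minimal sketch of running the test in R with the "lmtest" package; the simulated series below merely stand in for the actual data:

```r
# Minimal sketch: Breusch-Godfrey test via the lmtest package.
# Simulated data stand in for the real series (hypothetical).
library(lmtest)

set.seed(1)
x <- rnorm(100)
y <- 1 + 2 * x + arima.sim(list(ar = 0.6), n = 100)  # AR(1) errors

model <- lm(y ~ x)
bgtest(model, order = 4)  # H0: no serial correlation up to lag 4
```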

To avoid this problem, I can add a certain number of lags of the response variable $y$ as regressors in the original model. However, in some cases this method is not useful.
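
For instance, a hedged sketch of that remedy, using the "dynlm" package (my choice of package, with simulated stand-in data as above):

```r
# A sketch of the remedy above: add lags of the response as regressors.
# Uses the dynlm package; the data are simulated stand-ins.
library(dynlm)
library(lmtest)

set.seed(1)
x <- ts(rnorm(100))
y <- ts(1 + 2 * x + arima.sim(list(ar = 0.6), n = 100))

model_dyn <- dynlm(y ~ L(y, 1:2) + x)  # two lags of y, chosen ad hoc
bgtest(model_dyn, order = 4)           # re-check for autocorrelation
```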

Are there other options to consider?

  • Use robust standard errors? – Repmat Jul 29 '16 at 12:21
  • With the "sandwich" package in R? – Enzo D'Innocenzo Jul 29 '16 at 13:25
  • Related threads: ["Residual autocorrelation versus lagged dependent variable"](http://stats.stackexchange.com/questions/110757/residual-autocorrelation-versus-lagged-dependent-variable), ["When is it necessary to include the lag of the dependent variable in a regression model and which lag?"](http://stats.stackexchange.com/questions/69570/when-is-it-necessary-to-include-the-lag-of-the-dependent-variable-in-a-regressio) and ["Inclusion of lagged dependent variable in regression"](http://stats.stackexchange.com/questions/52458/inclusion-of-lagged-dependent-variable-in-regression). – Richard Hardy Jul 30 '16 at 19:52

2 Answers

8

If your regression-type model has serially correlated residuals, as a remedy you may include lags of the dependent variable as regressors, just as you mentioned. However, you might wish to preserve the original model for convenience of interpretation, direct representation of a theoretical model, or other reasons. In such a case, you have three options:

  1. Do what TPArrow suggested, i.e. keep the original model but allow the model errors to follow an AR process, and use penalized estimation. This way you get a penalized regression with AR errors.
  2. Keep the original model but allow the model errors to follow an ARMA (or more generally, SARIMA) process. This way you get a regression with ARMA errors.
  3. Leave the model specification intact and use heteroskedasticity and autocorrelation consistent (HAC) robust standard errors.

Let us examine the options in more detail:

1.
See the answer by TPArrow.

2.
The model can be estimated using functions arima ("stats" package) or auto.arima ("forecast" package) in R. You enter the regressors via the argument xreg and select the autoregressive and moving-average lag orders either manually (with arima) or automatically (with auto.arima).
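
A minimal sketch of both routes, with simulated data standing in for the actual series:

```r
# Sketch: regression with ARMA errors in R; simulated data as stand-ins.
library(forecast)

set.seed(1)
x <- rnorm(100)
y <- 1 + 2 * x + arima.sim(list(ar = 0.6), n = 100)

# Automatic selection of the ARMA error structure:
fit_auto <- auto.arima(y, xreg = x)

# Or fix the error structure manually, e.g. AR(1) errors:
fit_manual <- arima(y, order = c(1, 0, 0), xreg = x)

summary(fit_auto)
checkresiduals(fit_auto)  # residual diagnostics incl. a Ljung-Box test
```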

Comparing 1. with 2., the question is whether

  1. not allowing for moving average components in the error process but using penalization OR
  2. allowing for moving average components but not using penalization

works better for your particular example. I expect neither approach to be uniformly better, so you could try both and see which gives better results. This could be evaluated, for example, by estimating the models on part of the original sample and examining their performance on the remaining part.
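
A rough sketch of such a holdout comparison, reusing the simulated y and x from the sketch above:

```r
# A rough sketch of the holdout comparison described above,
# reusing the simulated y and x from the previous sketch.
library(forecast)

train <- 1:80
test  <- 81:100

fit <- auto.arima(y[train], xreg = x[train])
fc  <- forecast(fit, xreg = x[test])

accuracy(fc, y[test])  # out-of-sample RMSE, MAE, ...
```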

3.

Using HAC-robust standard errors may appear convenient but need not be the best option. Francis X. Diebold warns against that in his blog posts "The HAC Emperor has no Clothes" and "The HAC Emperor has no Clothes: Part 2" (and I am with him, if my voice counts):

> Punting via kernel-HAC estimation is a bad idea in time series, for several reasons:
>
> (1) Kernel-HAC is not likely to produce good $\beta$ estimates [and that is important in not-so-large samples]. <...>
>
> (2) Kernel-HAC is not likely to produce good $\beta$ inference [because] <...> kernel-HAC standard errors may be unnecessarily unreliable in small samples, even if they're accurate asymptotically.
>
> (3) Most crucially, kernel-HAC fails to capture invaluable predictive information. <...>
>
> The clearly preferable approach is traditional parametric disturbance heteroskedasticity / autocorrelation modeling, with GLS/ML estimation. Simply allow for ARMA(p,q)-GARCH(P,Q) disturbances (say), with p, q, P and Q selected by AIC (say). (In many applications something like AR(3)-GARCH(1,1) or ARMA(1,1)-GARCH(1,1) would be more than adequate.)

(I encourage you to read the posts in their entirety. They are quite short, very accessible and (last but not least) authored by a respected time series econometrician.)
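
Should you nevertheless go for option 3, the "sandwich" package mentioned in the comments is the usual route in R; a minimal sketch with simulated stand-in data:

```r
# Minimal sketch: HAC (Newey-West) standard errors via the sandwich
# and lmtest packages mentioned in the comments above.
library(sandwich)
library(lmtest)

set.seed(1)
x <- rnorm(100)
y <- 1 + 2 * x + arima.sim(list(ar = 0.6), n = 100)

model <- lm(y ~ x)
coeftest(model, vcov. = NeweyWest(model))  # t-tests with HAC std. errors
```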

Richard Hardy
  • Thank you Richard Hardy, I will try all the steps that you suggest. I think that in my case, auto.arima for the residuals can be useful. – Enzo D'Innocenzo Aug 01 '16 at 08:11
  • @RichardHardy Yes! But the system doesn't let me upvote your answer because I've got less than 15 rep. – Enzo D'Innocenzo Aug 01 '16 at 08:44
  • @EnzoD'Innocenzo, fine, no problem. It was just a general remark. – Richard Hardy Aug 01 '16 at 09:10
  • Could you elaborate why it is a remedy to include the lagged dependent variable as a regressor? – J3lackkyy Jun 22 '21 at 15:08
  • @J3lackkyy, this tends to remove autocorrelation. E.g. imagine a model that has an omitted lagged dependent variable. Its residuals will be autocorrelated. By including the omitted lag, the problem gets solved. – Richard Hardy Jun 22 '21 at 17:07
  • @RichardHardy I understand "tends" to remove autocorrelation, as an omitted lagged dependent variable is one possibility. But calling it a "remedy" implied something more general to me. Thanks for the enlightenment. – J3lackkyy Jun 22 '21 at 17:19
  • @J3lackkyy, I suppose you may find a more general description of this technique in some textbook, and I do not think it would be too pretentious to call it a remedy. But it has been a while since I looked at this, and I do not remember the details any longer. – Richard Hardy Jun 22 '21 at 18:00
2

The actual answer is to add some lags of the residuals as regressors, since the residuals are in fact autocorrelated.

Then, given you are in linear space, the problem reduces to \begin{align} & y_t = X_t\beta + \epsilon_t,\\ & \epsilon_t = \theta_1 \epsilon_{t-1} + \ldots + \theta_p \epsilon_{t-p} + e_t, \end{align} where $e_t \sim \text{i.i.d. } N(0,\sigma^2)$ with $\sigma^2 < \infty$. Now a penalized likelihood will do the order selection. More precisely, you should estimate the parameters under an $l_1$-penalized likelihood. Fortunately, this is already implemented; see the DREGAR package in R.
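
As a rough illustration of the idea only (this is not the DREGAR interface, which is not shown in this thread), one could run an $l_1$-penalized regression of the residuals on their own lags with "glmnet":

```r
# Conceptual sketch of l1-penalized AR-order selection for the residuals,
# using glmnet as a stand-in (NOT the DREGAR package referenced above).
library(glmnet)

set.seed(1)
x <- rnorm(200)
y <- 1 + 2 * x + arima.sim(list(ar = c(0.5, 0.3)), n = 200)

e     <- residuals(lm(y ~ x))
p_max <- 10                          # maximal AR order considered
E     <- embed(e, p_max + 1)         # col 1: e_t; cols 2..: e_{t-1}, ...

cv <- cv.glmnet(x = E[, -1], y = E[, 1], alpha = 1)  # lasso
coef(cv, s = "lambda.min")           # nonzero lags suggest the AR order
```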

Update: Your sentence "I must add a certain number of lags of the response variable y" refers to adding lags of the response, which leads to a different scenario from the one in your equation. Both scenarios are implemented in the DREGAR package.

TPArrow