OLS regression with Newey-West error term

Question

I need to estimate a linear regression with the OLS method. Since I assume the error terms to be correlated, I would like to account for heteroskedasticity and autocorrelation in the error terms. I already wrote some code, but I wonder whether it is correct.

So far, my code is:

myregression <- lm(x ~ y + z)
coeftest(myregression, vcov=NeweyWest(myregression))

Any help is highly appreciated!

Presumably you are using the sandwich and lmtest packages. You can type ?coeftest and ?NeweyWest for examples (at the end of the help page). I personally prefer dynlm objects for timeseries type data, in case that's where your autocorrelation comes from... — P.Windridge, Jan 04 '17 at 16:47
Thank you very much P.Windridge! I am using these packages. And I allready wrote the instructions. However, since it is very important that I do not make any mistake, I decided to ask for help here. As far as I understand you would rather use myregression — user144267, Jan 04 '17 at 16:52
If your question is only about R code, it is off topic here. If you have a statistical question about autocorrelation or sandwich estimators, please edit to clarify. — gung - Reinstate Monica, Jan 04 '17 at 17:41
Be aware that the `NeweyWest` function performs a procedure called prewhitening before actually computing the HAC standard errors. This is done in order to "increase the performance" of the HAC algorithm and might be a good idea in your case, but you should be aware that this is done by default (you can turn it off though). See [here](https://stats.stackexchange.com/q/312297/182174) for a related question and [here](https://www.rdocumentation.org/packages/sandwich/versions/2.4-0/topics/NeweyWest) for the documentation of the `NeweyWest` function. — Candamir, Nov 08 '17 at 21:18

P.Windridge · Answer 1 · 2017-01-04T19:52:13.417

This question is about code but seeing as I've been looking at HAC estimates recently in R I will "answer".

I have not checked the R implementation of Newey-West is exactly as in their original paper. However, your code does indeed calculate R's NeweyWest HAC estimate using the default bandwidth selection/lag method. (You can view this parameter with the "verbose=T" option.)
If you know the form of the correlations in your data then you can take less of a "sledgehammer" approach than Newey-West. E.g. Prais-Winsten or Cochrane-Orcutt.
Be aware that serial correlation is being examined here and so the order that your observations are sorted in does matter.

If you are interested you might consider a toy example where you generate correlated residuals on purpose to see how the Newey-West std error estimates/p-values differ. In the example below the correlated residuals make it "look like" the response can be fitted against a straight line. The lag parameter is automatically (and correctly) chosen = 1 (seen with verbose=T option). If you play with the process generating the residuals you can see how it changes.

set.seed(04012017)
n<-50
correlated_residuals<-arima.sim(list(ar = .9), n)
y<-correlated_residuals
x<-1:n
plot(x,correlated_residuals)
fit<-lm(y~x)
abline(fit)
summary(fit) # standard estimates
coeftest(fit,vcov=NeweyWest(fit,verbose=T))

Why do you say Newey-West is like a sledgehammer? – badmax Jan 22 '19 at 05:06 — badmax, Jan 22 '19 at 05:06

OLS regression with Newey-West error term

1 Answers1

Linked