
I have a linear regression model:

model <- lm(var1 ~ var2 + var3 + var4 + var5 + var6 + var7, data = df)

The null hypothesis of homoscedasticity is rejected, since the Breusch-Pagan test gives a small p-value:

bptest(model)$p.value
#BP 
#1.014577e-06
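
For context, bptest() (from the lmtest package) works by regressing the squared residuals on the model's regressors and treating n·R² of that auxiliary regression as a chi-squared statistic. A minimal sketch of the idea on toy data (the model m and regressor x below are illustrative, not my actual data):

# Sketch of the (studentized) Breusch-Pagan statistic; toy data, not df
set.seed(1)
x <- rnorm(100)
y <- 1 + x + rnorm(100, sd = abs(x))      # error variance grows with |x|
m <- lm(y ~ x)
u2   <- residuals(m)^2                    # squared residuals
aux  <- lm(u2 ~ x)                        # auxiliary regression on the regressors
stat <- length(u2) * summary(aux)$r.squared
pchisq(stat, df = 1, lower.tail = FALSE)  # agrees with bptest(m)$p.value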

But when I use a heteroscedasticity-robust estimate of the coefficient covariance matrix:

library("sandwich")
coeftest(model, vcov. = vcovHC(model))

... the value of Pr(>|t|) decreases for every parameter. So the coefficients appear to become more significant, although robust standard errors usually do the opposite.

Could you explain why this happens? Or are there special cases in which robust estimators do not reduce the statistical significance of the parameters? If so, which cases are they?

Thank you.

Macaronnos
  • See [this](http://stats.stackexchange.com/questions/143881/why-do-autocorrelation-and-heteroskedasticity-under-report-the-sample-variance-o/143889#comment275387_143889) discussion. – Christoph Hanck Apr 07 '15 at 03:31

1 Answer


It's common, but the robust standard error does not necessarily increase (and hence the p-value does not necessarily go up).

Here are three regression models. In the first, the smaller group has the larger variance, and the corrected standard error is larger than the uncorrected one. In the second, the larger group has the larger variance, and the standard error (and hence the p-value) shrinks. In the third, the groups are equal in size and the standard error hardly changes (it increases a little).
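
A back-of-the-envelope way to see why, for a binary regressor (a sketch in HC0 terms; vcovHC() defaults to the slightly larger HC3): with group sizes $n_0, n_1$ and within-group residual variances $s_0^2, s_1^2$,

$$\widehat{\operatorname{Var}}_{\text{robust}}(\hat\beta_1) \approx \frac{s_0^2}{n_0} + \frac{s_1^2}{n_1}, \qquad \widehat{\operatorname{Var}}_{\text{OLS}}(\hat\beta_1) = s_{\text{pooled}}^2\left(\frac{1}{n_0} + \frac{1}{n_1}\right).$$

When the large group has the large variance, $s_{\text{pooled}}^2$ is dominated by that group and the classical formula overstates the uncertainty of the difference; when the small group has the large variance, it understates it. For the second model below ($n_0 = 90$, $s_0 = 10$; $n_1 = 10$, $s_1 = 1$) this gives a robust SE of about $\sqrt{100/90 + 1/10} \approx 1.09$, close to the reported HC3 value of 1.1112, versus a classical SE of $\approx 3.18$, matching the summary output.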

> x <- c(rep(0, 10), rep(1, 90))
> y <- c(scale(runif(10)) * 10 + 1, scale(runif(90)))
> m1 <- lm(y ~ x)
> summary(m1)

Call:
lm(formula = y ~ x)

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)    1.000      1.005   0.995    0.322
x             -1.000      1.059  -0.944    0.347


> coeftest(m1, vcov. = vcovHC(m1))

t test of coefficients:

            Estimate Std. Error t value Pr(>|t|)
(Intercept)   1.0000     3.3333  0.3000   0.7648
x            -1.0000     3.3350 -0.2998   0.7649

> x <- c(rep(0, 90), rep(1, 10))
> y <- c(scale(runif(90)) * 10 + 1, scale(runif(10)))
> m2 <- lm(y ~ x)
> summary(m2)

Call:
lm(formula = y ~ x)


Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)    1.000      1.005   0.995    0.322
x             -1.000      3.178  -0.315    0.754

> coeftest(m2, vcov. = vcovHC(m2))

t test of coefficients:

            Estimate Std. Error t value Pr(>|t|)
(Intercept)   1.0000     1.0600  0.9434   0.3478
x            -1.0000     1.1112 -0.8999   0.3704

> x <- c(rep(0, 50), rep(1, 50))
> y <- c(scale(runif(50)) * 10 + 1, scale(runif(50)))
> m3 <- lm(y ~ x)
> summary(m3)

Call:
lm(formula = y ~ x)


Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)    1.000      1.005   0.995    0.322
x             -1.000      1.421  -0.704    0.483

Residual standard error: 7.106 on 98 degrees of freedom
Multiple R-squared:  0.005026,  Adjusted R-squared:  -0.005127 
F-statistic: 0.495 on 1 and 98 DF,  p-value: 0.4834

> coeftest(m3, vcov. = vcovHC(m3))

t test of coefficients:

            Estimate Std. Error t value Pr(>|t|)
(Intercept)   1.0000     1.4286  0.7000   0.4856
x            -1.0000     1.4357 -0.6965   0.4877
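
For completeness, here is a minimal sketch of what vcovHC() is doing, written out in its simplest (HC0) flavor; m3 is the last model above, and the default type = "HC3" additionally inflates each squared residual by 1/(1 - h_ii)^2:

# HC0 sandwich estimator by hand, for comparison with vcovHC()
X     <- model.matrix(m3)
u2    <- residuals(m3)^2
bread <- solve(crossprod(X))        # (X'X)^{-1}
meat  <- crossprod(X, X * u2)       # X' diag(u^2) X: each observation
                                    # contributes its own squared residual
vc    <- bread %*% meat %*% bread   # the "sandwich"
sqrt(diag(vc))                      # ~ sqrt(diag(vcovHC(m3, type = "HC0")))

Because each observation contributes its own squared residual to the meat, where the big residuals fall (in the small group or the large one) determines whether the robust standard error comes out above or below the classical one.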
Jeremy Miles