Confidence interval of linear regression variable calculation

Question

I want to be able to calculate a confidence interval from an estimated coefficient and its standard error.

I have a linear regression model which can be summarized (in R):

summary(fit1)

Call:
lm(formula = bwt ~ height + weight + parity, data = data)

Residuals:
    Min      1Q  Median      3Q     Max 
-66.913 -10.624   0.991  10.979  55.621 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) 31.19217   13.56879   2.299   0.0217 *  
height       1.24964    0.23083   5.414 7.48e-08 ***
weight       0.06781    0.02823   2.402   0.0164 *  
parity1     -1.83309    1.19838  -1.530   0.1264    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 17.9 on 1170 degrees of freedom
Multiple R-squared:  0.04898,   Adjusted R-squared:  0.04654 
F-statistic: 20.08 on 3 and 1170 DF,  p-value: 1.071e-12

With this model I can calculate the respective confidence intervals:

> confint(fit1)
                  2.5 %     97.5 %
(Intercept)  4.57029503 57.8140351
height       0.79676227  1.7025207
weight       0.01243198  0.1231932
parity1     -4.18429933  0.5181151

I would expect the interval for the coefficient of the predictor height to be given by

$$ 1.24964 \pm (1.96 \times 0.23083) = [0.7972132, 1.702067] $$

where 1.24964 is the estimated value for the coefficient and 0.23083 is the standard error for this coefficient. The numbers are close but not quite the same.

What am I doing wrong?

You are using 1.96, which is the 0.975 quantile of the standard normal distribution. But since we don't know the true variance of the error terms and have to estimate it, the interval must be based on a Student's t distribution. The two are almost (but not quite) the same; the difference vanishes as the sample size grows. — Andreas, Feb 06 '16 at 01:33
Right. Check this out: `fit = lm(mpg ~ wt, mtcars)`; `coef = summary(fit)$coefficients[2,1]`; `err = summary(fit)$coefficients[2,2]`; `coef + c(-1,1) * err * qt(0.975, 30)`; `confint(fit, 'wt', level=0.95)` — Antoni Parellada, Feb 06 '16 at 01:43
Thanks Andreas. You are right. Using `qt(p=0.975, df=1170)` instead of 1.96 works. — JC1, Feb 06 '16 at 01:45
Pretty much a duplicate, for example of [this question](http://stats.stackexchange.com/questions/29981/should-confidence-intervals-for-linear-regression-coefficients-be-based-on-the-n). Numerous other relevant answers can be found. You might find it helpful to read about [pivotal quantities](https://en.wikipedia.org/wiki/Pivotal_quantity). A number of posts here discuss obtaining confidence intervals from pivotal quantities. — Glen_b, Feb 06 '16 at 02:28
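
As a worked check of the comments above, here is a sketch of the t-based interval for height, using the 1170 residual degrees of freedom reported in the summary. The multiplier 1.962 is an approximation of `qt(0.975, 1170)` rounded to three decimals, and the inputs are the rounded figures printed by `summary(fit1)`, so the last digits may differ slightly from what R computes internally.

$$ 1.24964 \pm (1.962 \times 0.23083) \approx [0.7968, 1.7025] $$

To the digits shown, this agrees with the height row of `confint(fit1)`, whereas the normal-based multiplier 1.96 gives a slightly narrower interval.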
