1

After a series of tests and exploratory analysis, I have design a linear model in R, the model is the following:

summary(gn)

Call:
lm(formula = NA. ~ I(PC^0.25) + I(((PI)^2)), data = DSET)

Residuals:
    Min      1Q  Median      3Q     Max 
-425.22  -87.46   -2.30   79.11  396.14 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)    
(Intercept) -2.047e+03  1.094e+02  -18.71  < 2e-16 ***
I(PC^0.25)   1.206e+03  4.231e+01   28.52  < 2e-16 ***
I(((PI)^2)) -5.242e-02  1.233e-02   -4.25 2.81e-05 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 127.2 on 319 degrees of freedom
Multiple R-squared:  0.7475,    Adjusted R-squared:  0.746 
F-statistic: 472.3 on 2 and 319 DF,  p-value: < 2.2e-16

Then, for example, I calculate the interval confidence for a certain input:

> a<-data.frame(PC=58,PI=12)
> a<-predict(gn,newdata=a,interval="prediction",level=0.95)
> a
       fit      lwr      upr
1 1274.515 1022.976 1526.054

Finally, given the coefficients of the model and the same input, I try to calculate the same confidence interval in EXCEL. Using the formula for the 95% interval of each coefficient, <estimate coefficient - 1.96*error, estimate coefficient + 1.96*error>, the fitted value is reasonable but the extremes are far different.

So, the results are : fitted value=1274.499125 , lower=826.5811979 , upper=1722.417051. As you see, the extremes are different from the output of R.

What's my error? What should I do to get the same intervale I got in R?

CreamStat
  • 245
  • 1
  • 4
  • 10
  • 1
    I believe this question is answered at http://stats.stackexchange.com/questions/9131/obtaining-a-formula-for-prediction-limits-in-a-linear-model/9144#9144, which provides the correct formula for regression prediction intervals. Implement that and let us know how it compares to the `R` output. – whuber Jun 19 '14 at 16:30

0 Answers0