Basic questions concerning the interpretation of results from summary(lm(...~...)) in R

Question

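set.seed(11)
a <- runif(12)                        # response: 12 uniform random values
b <- rep(c(1, 2, 3), 4)               # predictor: 1, 2, 3 repeated four times
summary(lm(a ~ b))$coefficients       # model with an intercept
summary(lm(a ~ b - 1))$coefficients   # model without an intercept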

What does the p-value for the intercept mean?

What difference does the second formula make?

Why is the p-value lower and the R-squared higher with the second formula?

What does the first column ("Estimate") mean?

Thanks a lot for your help.

I find 848 posts (so far) related to interpreting regression coefficients. A highly voted one that covers all aspects of your question is at http://stats.stackexchange.com/questions/5135/interpretation-of-rs-lm-output. — whuber, May 14 '13 at 19:06

score 1 · Accepted Answer · answered May 14 '13 at 16:30


The p-value for the intercept comes from a test of the null hypothesis that the intercept is equal to zero.
The second formula fits a regression without an intercept, which forces the fitted line through the point (0,0) (the blue line in the figure below). There is a fair amount of discussion on this site about whether it is appropriate to fit a model without an intercept.
"Estimate" refers to the regression parameters estimated by each model. The first model estimates both an intercept and a slope; the second estimates only the slope.

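The model form is $Y = mx + b$, where $m$ is the slope and $b$ is the intercept. (This $b$ is not the predictor `b` from the question's code; there, `a` plays the role of $Y$ and `b` the role of $x$.)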
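To make the columns of the coefficient table concrete, here is a minimal sketch that recomputes the intercept's p-value by hand from the question's data. The names `fit`, `coefs`, `t_int` and `p_int` are introduced here for illustration only. It assumes the usual two-sided t-test on the residual degrees of freedom, which is what `summary.lm` reports.

```r
set.seed(11)
a <- runif(12)
b <- rep(c(1, 2, 3), 4)

fit <- lm(a ~ b)
coefs <- summary(fit)$coefficients

# "Estimate" holds the fitted intercept and slope
est_int <- coefs["(Intercept)", "Estimate"]
se_int  <- coefs["(Intercept)", "Std. Error"]

# The t value is the estimate divided by its standard error
t_int <- est_int / se_int

# Two-sided p-value for H0: intercept = 0, using the residual degrees of freedom
p_int <- 2 * pt(abs(t_int), df = df.residual(fit), lower.tail = FALSE)

# Compare with the value reported by summary()
print(all.equal(p_int, coefs["(Intercept)", "Pr(>|t|)"]))
```

The same calculation applies to the slope row; only the row name changes.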

[Figure: Regression lines]


B Williams

Thank you! One thing is still unclear to me: why is the p-value lower and the R-squared higher when the regression is forced through the point (0,0)? – Remi.b May 14 '13 at 18:37
This may help you http://stats.stackexchange.com/questions/47527/use-squared-correlation-in-regression-without-intercept – B Williams May 14 '13 at 19:06
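
As a sketch of the point behind that link: when the intercept is dropped, `summary.lm` computes R-squared against a different baseline. With an intercept, the total sum of squares is taken around the mean of the response. Without one, it is taken around zero, so the two R-squared values are not directly comparable. The variable names below (`fit1`, `fit0`, `rss1`, `rss0`, `r2_with`, `r2_without`) are illustrative and not from the original post.

```r
set.seed(11)
a <- runif(12)
b <- rep(c(1, 2, 3), 4)

fit1 <- lm(a ~ b)      # with intercept
fit0 <- lm(a ~ b - 1)  # forced through the origin

rss1 <- sum(resid(fit1)^2)
rss0 <- sum(resid(fit0)^2)

# With an intercept: baseline is the mean of a (centered total sum of squares)
r2_with <- 1 - rss1 / sum((a - mean(a))^2)

# Without an intercept: baseline is zero (uncentered total sum of squares)
r2_without <- 1 - rss0 / sum(a^2)

# Both should match what summary() reports
print(all.equal(r2_with, summary(fit1)$r.squared))
print(all.equal(r2_without, summary(fit0)$r.squared))

# The no-intercept fit never has a smaller residual sum of squares
print(rss0 >= rss1)
```

Because `a` is drawn from a uniform distribution on (0, 1), its values are far from zero relative to their spread, so `sum(a^2)` is much larger than `sum((a - mean(a))^2)`. That inflates the no-intercept R-squared even though the fit itself is no better. The slope's p-value drops for a related reason: without an intercept, the slope alone has to account for the response being well away from zero.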
