I am interested in understanding the diagnostic plots we get after running the lm()
command (for linear regression) in R, for example:
lm.mod1 = lm(y ~ x1 + x2)
I then get the summary with:
summary(lm.mod1)
I get the result as:
Residuals:
    Min      1Q  Median      3Q     Max
-750.32 -160.54  -49.83  115.83 2923.74

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) -345.1552    37.0393  -9.319   <2e-16 ***
x1            52.9091     2.4929  21.224   <2e-16 ***
x2             8.9669     0.5395  16.620   <2e-16 ***

Residual standard error: 274.4 on 1985 degrees of freedom
Multiple R-squared: 0.2059, Adjusted R-squared: 0.2051
F-statistic: 257.3 on 2 and 1985 DF, p-value: < 2.2e-16
I then plot the model with:
par(mfrow = c(2,2))
plot(lm.mod1)
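Since I can't attach the plots, here is a minimal sketch that should reproduce the same kind of panels. The data are simulated stand-ins (my assumption, not the data behind the summary above), and I believe the `which` argument of plot() selects individual panels, with 3 being Scale-Location and 5 being Residuals vs. Leverage:

```r
# Simulated stand-in data; NOT the data behind the summary above.
set.seed(1)
n  <- 200
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- 1 + 2 * x1 - x2 + rnorm(n)

lm.mod1 <- lm(y ~ x1 + x2)

# Draw only the last two diagnostic panels.
par(mfrow = c(1, 2))
plot(lm.mod1, which = 3)  # Scale-Location
plot(lm.mod1, which = 5)  # Residuals vs. Leverage

# The per-observation quantities that, as far as I can tell, sit behind those panels.
lev   <- hatvalues(lm.mod1)       # leverage (diagonal of the hat matrix)
cooks <- cooks.distance(lm.mod1)  # Cook's distance
rstd  <- rstandard(lm.mod1)       # standardized residuals
```

The last three calls only extract the numbers, which may make it easier to point at specific observations when discussing the plots.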
I get 4 graphs (I can't post the graphs since I am a new user and my experience level is below 10. :/)
My questions are:
How are the F-statistic and the t values calculated?
Could someone explain how to interpret the last two graphs, i.e. $\text{Scale-Location}$ (which plots $\sqrt{|\text{Standardized residuals}|}$ against the fitted values) and $\text{Residuals vs. Leverage}$? What is meant by leverage?
What is meant by Cook's distance? I saw it on Wikipedia but I didn't get it.
How can we tell whether our model is a good model or not?
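Regarding the first question, my tentative guess is that each t value is the estimate divided by its standard error, and that the F-statistic can be recovered from R-squared and the two degrees of freedom. A rough check using only numbers copied from the summary above (the variable names are mine, and agreement can only be up to rounding of the printed values):

```r
# Values copied from the summary output above.
est <- c(-345.1552, 52.9091, 8.9669)  # Estimate column
se  <- c(37.0393, 2.4929, 0.5395)     # Std. Error column

# Guess: t value = estimate / standard error.
t_value <- est / se

# Guess: F = (R^2 / p) / ((1 - R^2) / df), with p predictors and df residual DF.
r2 <- 0.2059
p  <- 2
df <- 1985
f_stat <- (r2 / p) / ((1 - r2) / df)

print(round(t_value, 2))  # compare with the "t value" column
print(round(f_stat, 1))   # compare with the reported F-statistic
```

If these formulas are right, I would still like to understand why they take this form.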