
I've been working on understanding how R calculates the standard errors reported by summary() for a regression fitted with lm(), like:

set.seed(1)  # fix the RNG seed so the example is reproducible
x <- 1:20

y1 <- 3 * x + rnorm(length(x), sd = 0.5)  # linear trend plus small noise
r1 <- lm(y1 ~ x)
s1 <- summary(r1)
s1$coefficients  # estimates, standard errors, t values, p values

and came across the s1$cov.unscaled matrix, which is calculated from the design matrix of the regression like:

X <- matrix(c(rep(1, length(x)), x), ncol = 2)  # design matrix: intercept column and x
M <- solve(t(X) %*% X)                          # (X'X)^{-1}

where M == s1$cov.unscaled. Since M depends only on x, and not on the response, it is also identical to s2$cov.unscaled in:

y2 <- 3 * x + rnorm(length(x), sd = 5)  # same trend, much noisier
r2 <- lm(y2 ~ x)
s2 <- summary(r2)
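A quick check confirms this (a minimal sketch, assuming the objects defined above are still in the workspace; check.attributes = FALSE is needed only because my hand-built M lacks the dimnames that summary() attaches):

all.equal(M, s1$cov.unscaled, check.attributes = FALSE)  # TRUE
all.equal(s1$cov.unscaled, s2$cov.unscaled)              # TRUE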

In this second regression the residual error is much higher, yet the unscaled covariance matrix is unchanged. Together with the residual standard error s1$sigma, M can then be used to calculate the standard errors of the parameter estimates as reported in s1$coefficients:

sqrt(diag(M) * s1$sigma^2)  # standard errors of the coefficient estimates
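As a sanity check (again just a sketch using the objects above; se_manual is a throwaway name of my own), this reproduces the "Std. Error" column reported by summary():

se_manual <- sqrt(diag(M) * s1$sigma^2)
all.equal(unname(se_manual), unname(s1$coefficients[, "Std. Error"]))  # TRUE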

The meaning of the standard error of a parameter estimate is quite intuitive, and it also makes sense that an estimate of how well the regression fits the data (like s1$sigma) would enter its calculation (i.e. $\widehat{\sigma}^2 (X^{\top}X)^{-1}$). However, I am having trouble understanding intuitively the meaning of s1$cov.unscaled (or, more generally, $(X^{\top}X)^{-1}$), which depends only on x and not on y1 or y2. Is there an intuitive meaning of this construct?
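For context, here is my understanding of where the formula comes from, assuming the usual model $y = X\beta + \varepsilon$ with $\operatorname{Var}(\varepsilon) = \sigma^2 I$, so that $\widehat{\beta} = (X^{\top}X)^{-1}X^{\top}y$:

$$\operatorname{Var}(\widehat{\beta}) = (X^{\top}X)^{-1}X^{\top}\,\sigma^2 I\,X\,(X^{\top}X)^{-1} = \sigma^2 (X^{\top}X)^{-1},$$

so $(X^{\top}X)^{-1}$ carries all the dependence on the design, and $\sigma^2$ is the only part estimated from the response. What I'm missing is an intuition for the design-dependent factor itself.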

I'd be very grateful for any clarifications or reading suggestions on this matter!

see also this question for answers: https://stats.stackexchange.com/questions/267948/intuitive-explanation-of-the-xtx-1-term-in-the-variance-of-least-square – clw May 08 '19 at 16:23
