In trying to answer a question here on Cross Validated, I was re-reading Section 3.2.3, specifically Algorithm 3.1 from Elements of Statistical Learning.
What I followed from this is that, given a model with one dependent variable and two independent variables,
$Y = \beta_0 + \beta_1X_1 + \beta_2X_2 + \epsilon$
then an estimated regression coefficient, say $\hat{\beta_1}$ in the equation above, would be the same $\hat{\beta_1}$ obtained from this equation:
$z_2 = \beta_0 + \beta_1X_1 + \epsilon$,
where
$z_2 = Y - \beta_0 - \beta_2X_2$
That is, my understanding is that the coefficient $\hat{\beta_j}$ from a multiple linear regression equals the coefficient you would get by regressing the dependent variable on all the other predictors (everything besides $X_j$), taking the residuals from that model, and then regressing those residuals on $X_j$. However, I simulated some data and did not get this:
set.seed(1839) # setting seed
x1 <- rnorm(200, 0, 1) # generating x1
x2 <- x1 + rnorm(200, 1, 3) # generating a correlated x2
eps <- rnorm(200, 0, 6) # generating error
y <- x1 + x2 + eps # making y
fit <- lm(y ~ x1 + x2) # fitting overall model
Looking at the summary, we can see the coefficients for x1 and x2:
summary(fit) # looking at summary
Call:
lm(formula = y ~ x1 + x2)
Residuals:
     Min       1Q   Median       3Q      Max
-18.4923  -4.4054   0.2954   4.0371  15.0697
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  -0.5802     0.4462  -1.300    0.195
x1            0.1487     0.4733   0.314    0.754
x2            1.1272     0.1397   8.071 6.77e-14 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 6.021 on 197 degrees of freedom
Multiple R-squared: 0.2706, Adjusted R-squared: 0.2631
F-statistic: 36.53 on 2 and 197 DF, p-value: 3.199e-14
Now, why don't these coefficients match the ones from the two-step regressions below?
coef(lm(residuals(lm(y ~ x2)) ~ x1))[2] # not exactly equal to the x1 in the original fit
x1
0.1358402
coef(lm(residuals(lm(y ~ x1)) ~ x2))[2] # not exactly equal to the x2 in the original fit
x2
1.029427
Why are these x1 and x2 coefficients not the same as those above? They are close; is this due to rounding? Or am I missing something from Algorithm 3.1?
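To put a number on the rounding question, here is a small self-contained check. It is only a sketch: it regenerates the simulated data with the same seed and reports how far each two-step estimate is from the corresponding full-model coefficient, with machine epsilon printed for scale. The names `b1_two_step`, `b2_two_step`, `gap_x1`, and `gap_x2` are my own labels for illustration.

```r
# Recreate the simulated data from the question (same seed, same draw order)
set.seed(1839)
x1  <- rnorm(200, 0, 1)
x2  <- x1 + rnorm(200, 1, 3)
eps <- rnorm(200, 0, 6)
y   <- x1 + x2 + eps

# Full model with both predictors
fit <- lm(y ~ x1 + x2)

# Two-step estimates as described in the question: residuals of y on the
# other predictor, then regressed on the predictor of interest
b1_two_step <- coef(lm(residuals(lm(y ~ x2)) ~ x1))[["x1"]]
b2_two_step <- coef(lm(residuals(lm(y ~ x1)) ~ x2))[["x2"]]

# Gaps between full-model and two-step coefficients (illustrative names)
gap_x1 <- coef(fit)[["x1"]] - b1_two_step
gap_x2 <- coef(fit)[["x2"]] - b2_two_step

print(c(x1 = gap_x1, x2 = gap_x2))
print(.Machine$double.eps)  # floating-point scale, for comparison
```

If the gaps come out many orders of magnitude above `.Machine$double.eps`, rounding alone seems an unlikely explanation for the mismatch.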