
In Section 3.2.3 of *The Elements of Statistical Learning* (Link), there is this statement about multiple regression coefficients on page 54:

we have shown that the $j$th multiple regression coefficient is the univariate regression coefficient of $\mathbf{y}$ on $\mathbf{x}_{j\cdot 012\ldots(j-1)(j+1)\ldots p}$, the residual after regressing $\mathbf{x}_j$ on $\mathbf{x}_0, \ldots, \mathbf{x}_{j-1}, \mathbf{x}_{j+1}, \ldots, \mathbf{x}_p$

If I go by the definition of "regress $\mathbf{b}$ on $\mathbf{a}$" (introduced on page 53), the statement above would mean that the $j$th multiple regression coefficient is given by

$$ \hat{\beta}_j = \frac{\langle \mathbf{x}_j, \mathbf{r} \rangle}{\langle \mathbf{r}, \mathbf{r} \rangle}, \qquad \text{where} \quad \mathbf{r} = \mathbf{x}_j - \sum_{\substack{k = 0 \\ k \ne j}}^{p} \frac{\langle \mathbf{x}_j, \mathbf{x}_k \rangle}{\langle \mathbf{x}_k, \mathbf{x}_k \rangle}\, \mathbf{x}_k. $$

Edit: I understand that the expression for $\hat{\beta}_j$ is wrong. However, I would like to understand what I am misinterpreting in the notation "residual after regressing $\mathbf{x}_j$ on $\mathbf{x}_0, \ldots, \mathbf{x}_{j-1}, \mathbf{x}_{j+1}, \ldots, \mathbf{x}_p$".
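A quick numerical check (a sketch of my own, with made-up data and `numpy` only) also shows the expression above cannot be the multiple regression coefficient — notice it does not even involve $\mathbf{y}$:

```python
import numpy as np

# Toy data: an intercept column plus two correlated predictors
rng = np.random.default_rng(0)
n = 200
x0 = np.ones(n)
x1 = rng.normal(size=n)
x2 = x1 + 0.5 * rng.normal(size=n)           # correlated with x1
X = np.column_stack([x0, x1, x2])
y = X @ np.array([1.0, 2.0, -1.0]) + rng.normal(size=n)

# r as written above: subtract separate *univariate* projections from x_j (j = 2)
r = x2 - (x2 @ x0) / (x0 @ x0) * x0 - (x2 @ x1) / (x1 @ x1) * x1
beta_guess = (x2 @ r) / (r @ r)

# The actual multiple regression coefficient, for comparison
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

print(beta_guess, beta_hat[2])               # these do not match
```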

ethelion
  • See https://stats.stackexchange.com/a/46508/919 and https://stats.stackexchange.com/questions/17336, *inter alia.* I hope they clarify how your formula isn't quite right. – whuber Mar 22 '21 at 17:39
  • I had the feeling that it's not right. Trying to understand what the statement I quoted from the textbook means instead. Specifically, what would "residual after regressing $\mathbf{x}_j$ on $\mathbf{x}_0, \ldots, \mathbf{x}_{j-1}, \mathbf{x}_{j+1}, \ldots, \mathbf{x}_p$" look like as an expression, if not what I wrote? – ethelion Mar 22 '21 at 17:46
  • 1
    The notation means you have to perform a multivariate regression of $x_j$ against all the other explanatory variables, then replace $x_j$ by its residuals: that is, you need to "take out" the components of all the other variables from $x_j.$ Your notation describes $p-1$ separate *univariate* regressions of $x_j$ against each other variable. – whuber Mar 22 '21 at 18:29
  • @whuber This makes sense and answers my question. I think I should update the question to reflect that I was confused more by the notation and not so much by what the regression coefficient will look like. Also, I would like to mark your comment above as the answer. How do I do that? – ethelion Mar 23 '21 at 00:50
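To make whuber's comment concrete, here is a short sketch (my own code and variable names, using `numpy`): regress $\mathbf{x}_j$ on all the other columns in a single multivariate least-squares fit, keep the residual $\mathbf{z}$, and then the univariate coefficient of $\mathbf{y}$ on $\mathbf{z}$ reproduces the $j$th multiple regression coefficient.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
X = np.column_stack([np.ones(n),                      # x_0 = intercept
                     rng.normal(size=(n, 2))])
X[:, 2] += 0.8 * X[:, 1]                              # correlate the predictors
y = X @ np.array([1.0, 2.0, -1.0]) + rng.normal(size=n)

j = 2
others = [k for k in range(X.shape[1]) if k != j]

# One *multivariate* regression of x_j on all the other columns ...
gamma, *_ = np.linalg.lstsq(X[:, others], X[:, j], rcond=None)
z = X[:, j] - X[:, others] @ gamma                    # ... and keep the residual

# Univariate regression of y on that residual
beta_j = (z @ y) / (z @ z)

# Matches the j-th coefficient of the full multiple regression
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta_j, beta_hat[j])                            # these agree
```

The agreement holds for any $j$ and any number of predictors, not just this toy example.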

0 Answers