
1) So I know that $h_{ii}$ is just the $i$th diagonal entry of $H=X(X^TX)^{-1}X^T$. Intuitively, why is this the case? I understand that $H$ is the projection matrix and that leverage measures how far away an observation is from the other observations. I also understand it when I look at the formula in the case of simple linear regression: $h_{ii}=\frac{1}{n}+\frac{(x_i-\bar{x})^2}{\sum_{j=1}^n(x_j-\bar{x})^2}$. But I don't understand how $x_i^T(X^TX)^{-1}x_i$ (where $x_i^T$ is the $i$th row of $X$) measures leverage.
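A quick numerical check that the two expressions agree, as a minimal numpy sketch with made-up data (variable names are my own):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=10)                      # hypothetical predictor values
X = np.column_stack([np.ones_like(x), x])    # SLR design matrix with intercept

# Diagonal of the hat matrix H = X (X^T X)^{-1} X^T
H = X @ np.linalg.inv(X.T @ X) @ X.T
h_general = np.diag(H)

# Closed-form SLR leverage: 1/n + (x_i - xbar)^2 / sum_j (x_j - xbar)^2
h_slr = 1/len(x) + (x - x.mean())**2 / ((x - x.mean())**2).sum()

print(np.allclose(h_general, h_slr))         # True: the two formulas agree
```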

2) Also, when trying to derive $h_{ii}$ for SLR, I'm getting $h_{ii}=\frac{\sum x_j^2-2x_i\sum x_j+nx_i^2}{n\sum x_j^2-(\sum x_j)^2}$, which I can't simplify into the previous formula, so I assume I did it wrong. I used $H=X(X^TX)^{-1}X^T$ where $X=\left[\begin{array} {cc} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_n \end{array}\right]$, I had $(X^TX)^{-1}=\frac{1}{n\sum x_j^2-(\sum x_j)^2} \left[\begin{array} {cc} \sum x_j^2 & -\sum x_j \\ -\sum x_j & n \end{array}\right]$, and ended up with $$H=\frac{1}{n\sum x_j^2-(\sum x_j)^2} \left[\begin{array} {cccc} \sum x_j^2-2x_1\sum x_j+nx_1^2 & \sum x_j^2-(x_1+x_2)\sum x_j+nx_1x_2 & \cdots & \sum x_j^2-(x_1+x_n)\sum x_j+nx_1x_n \\ \sum x_j^2-(x_1+x_2)\sum x_j+nx_1x_2 & \sum x_j^2-2x_2\sum x_j+nx_2^2 & \cdots & \vdots \\ \vdots & \vdots & \ddots & \vdots \\ \sum x_j^2-(x_1+x_n)\sum x_j+nx_1x_n & \cdots & \cdots & \sum x_j^2-2x_n\sum x_j+nx_n^2 \end{array}\right]$$
Anywhere obvious where I went wrong?
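For reference, a sketch of the simplification the expression above seems to be missing, writing $\bar{x}=\frac{1}{n}\sum_j x_j$ and $S_{xx}=\sum_j(x_j-\bar{x})^2$:

$$
\begin{aligned}
n\sum_j x_j^2-\Big(\sum_j x_j\Big)^2 &= n\Big(\sum_j x_j^2-n\bar{x}^2\Big) = nS_{xx},\\
\sum_j x_j^2-2x_i\sum_j x_j+nx_i^2 &= \Big(\sum_j x_j^2-n\bar{x}^2\Big)+n\big(\bar{x}^2-2x_i\bar{x}+x_i^2\big) = S_{xx}+n(x_i-\bar{x})^2,
\end{aligned}
$$

so $h_{ii}=\frac{S_{xx}+n(x_i-\bar{x})^2}{nS_{xx}}=\frac{1}{n}+\frac{(x_i-\bar{x})^2}{S_{xx}}$. In other words, the derivation is correct; only this last simplification step was missing.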

Brian

1 Answer


I think you would be much better off using matrix algebra throughout.

As for your first question, remember that the fitted values in a linear model are obtained as:

$$\hat{y} = H y$$

Intuitively, since $\hat{y}_i=\sum_j h_{ij}y_j$, $h_{ii}=1$ means that observation $y_i$ fully determines $\hat{y}_{i}$, so in a sense it has maximum leverage. If $h_{ii} \approx 0$, observation $y_i$ plays very little role in determining $\hat{y}_{i}$, which is then mostly determined by the rest of the observations.
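A small illustration of this (my own sketch, with made-up numbers): an $x$ value far from the others gets $h_{ii}$ close to 1, and changing that observation's $y_i$ moves its fitted value $\hat{y}_i$ almost one-for-one.

```python
import numpy as np

# One point isolated far from the others in x
x = np.array([0.0, 1.0, 2.0, 3.0, 100.0])
X = np.column_stack([np.ones_like(x), x])
H = X @ np.linalg.inv(X.T @ X) @ X.T

print(np.diag(H))                 # leverage of the last point is close to 1

# Perturb only the high-leverage response: its fit moves almost one-for-one
y = np.array([1.0, 2.0, 3.0, 4.0, 50.0])
y_pert = y.copy()
y_pert[-1] += 1.0
print((H @ y_pert - H @ y)[-1])   # roughly h_55: y_5 nearly determines its own fit
```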

F. Tusell
  • I am trying to grasp the intuition you mentioned about $h_{ii}=1$. Are you suggesting that if all the diagonals of the matrix are 1, then the off-diagonal entries are 0 and we have a perfect correlation of 1? – GENIVI-LEARNER Apr 18 '20 at 15:35
  • @GENIVI-LEARNER, if the $h_{ii}$ are all ones, then indeed the rest of the matrix is made of zeroes (this can be shown; see the sketch below) and $\hat{y}_i$ "copies" $y_i$. You have maximum leverage in the sense that no matter what the other observations are, observation $y_i$ is exactly fitted. Yes, you would have perfect correlation between the $y_i$'s and the $\hat{y}_i$'s, and even more than that, identity. – F. Tusell Apr 18 '20 at 20:31
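A sketch of the "can be shown" step in the last comment: $H$ is symmetric ($H^T=H$) and idempotent ($H^2=H$), so

$$h_{ii}=(H^2)_{ii}=(HH^T)_{ii}=\sum_{j} h_{ij}^2=h_{ii}^2+\sum_{j\neq i}h_{ij}^2.$$

If $h_{ii}=1$, this forces $\sum_{j\neq i}h_{ij}^2=0$, hence $h_{ij}=0$ for every $j\neq i$, and then $\hat{y}_i=\sum_j h_{ij}y_j=y_i$ exactly.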