By minimizing the MSE, we get that the unique solution to this optimization problem is $\beta = (X^TX)^{-1}X^Ty$. Why is its variance matrix $Var(\beta)=(X^TX)^{-1}\sigma^{2}$, where we assume that the observations $y_{i}$ have constant variance $\sigma^{2}$?
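(For reference, the stated solution follows from the first-order condition of the least-squares objective, assuming $X^TX$ is invertible:

$$\min_{\beta}\,\|y - X\beta\|^{2} \;\Rightarrow\; \nabla_{\beta}\,\|y - X\beta\|^{2} = -2X^{T}(y - X\beta) = 0 \;\Rightarrow\; X^{T}X\beta = X^{T}y \;\Rightarrow\; \beta = (X^{T}X)^{-1}X^{T}y.)$$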
Take a look at page 5 of http://cs229.stanford.edu/summer2020/BiasVarianceAnalysis.pdf. You can ignore the $\lambda I$ term, since it becomes $0$ if you are not doing ridge regression. The key idea is that constants "factor out" of a variance or covariance, whether that is a constant scalar times a one-dimensional random variable or a constant matrix times a random vector ($y$ in your case). In linear regression, the matrix $X$ is treated as constant, or you can regard it as implicitly conditioned on, $Var(\beta \mid X)$. – MathFoliage Feb 27 '21 at 10:40
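To spell out the step the comment describes, here is a sketch of the derivation, assuming $Var(y) = \sigma^{2} I$ and a fixed design matrix $X$, and using the rule $Var(Ay) = A\,Var(y)\,A^{T}$ for a constant matrix $A$:

$$Var(\beta) = Var\!\big((X^{T}X)^{-1}X^{T}y\big) = (X^{T}X)^{-1}X^{T}\,Var(y)\,X(X^{T}X)^{-1} = \sigma^{2}(X^{T}X)^{-1}X^{T}X(X^{T}X)^{-1} = \sigma^{2}(X^{T}X)^{-1}.$$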
@MathFoliage Thank you very much for the cs229 file and the clarification. – XXX Feb 27 '21 at 12:28
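As an illustrative sanity check (not from the original thread), a short Monte Carlo simulation can confirm the formula empirically; the design matrix, $\beta$, and $\sigma$ below are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(0)

n, p, sigma = 200, 3, 2.0
X = rng.normal(size=(n, p))            # fixed design matrix
beta_true = np.array([1.0, -2.0, 0.5])

# Draw many datasets y = X beta + noise and refit each time
fits = []
for _ in range(20000):
    y = X @ beta_true + sigma * rng.normal(size=n)
    fits.append(np.linalg.solve(X.T @ X, X.T @ y))  # (X^T X)^{-1} X^T y

empirical = np.cov(np.array(fits), rowvar=False)    # covariance of the estimates
theoretical = sigma**2 * np.linalg.inv(X.T @ X)     # sigma^2 (X^T X)^{-1}

print(np.round(empirical, 4))
print(np.round(theoretical, 4))                     # the two should be close
```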