10

I have a design matrix of p regressors, n observations, and I am trying to compute the sample variance-covariance matrix of the parameters. I am trying to directly calculate it using svd.

I am using R, when I take svd of the design matrix, I get three components: a matrix $U$ which is $n \times p$, a matrix $D$ which is $1\times 3$ (presumably eigenvalues), and a matrix $V$ which is $3\times 3$. I diagonalized $D$, making it a $3\times 3$ matrix with 0's in the off-diagonals.

Supposedly, the formula for covariance is: $V D^2 V'$, however, the matrix does not match, nor is it even close to R's built in function, vcov. Does anyone have any advice/references? I admit that I am a bit unskilled in this area.

mpiktas
  • 33,140
  • 5
  • 82
  • 138
Will
  • 357
  • 2
  • 8

1 Answers1

15

First, recall that under assumptions of multivariate normality of the linear-regression model, we have that $$ \hat{\beta} \sim \mathcal{N}( \beta, \sigma^2 (X^T X)^{-1} ) . $$

Now, if $X = U D V^T$ where the right-hand side is the SVD of X, then we get that $X^T X = V D U^T U D V = V D^2 V^T$. Hence, $$ (X^T X)^{-1} = V D^{-2} V^T . $$

We're still missing the estimate of the variance, which is $$ \hat{\sigma}^2 = \frac{1}{n - p} (y^T y - \hat{\beta}^T X^T y) . $$

Though I haven't checked, hopefully vcov returns $\hat{\sigma}^2 V D^{-2} V^T$.

Note: You wrote $V D^2 V^T$, which is $X^T X$, but we need the inverse for the variance-covariance matrix. Also note that in $R$, to do this computation you need to do

vcov.matrix <- var.est * (v %*% d^(-2) %*% t(v))

observing that for matrix multiplication we use %*% instead of just *. var.est above is the estimate of the variance of the noise.

(Also, I've made the assumptions that $X$ is full-rank and $n \geq p$ throughout. If this is not the case, you'll have to make minor modifications to the above.)

cardinal
  • 24,973
  • 8
  • 94
  • 128
  • @Will, good. Glad it worked. You might consider accepting the answer then. Regards. – cardinal Jan 31 '11 at 03:58
  • I tried the equation but this doesn't quite work. http://stats.stackexchange.com/questions/195379/how-to-use-svd-for-linear-regression-script-provided – SmallChess Feb 13 '16 at 12:04