
There are many similar questions on this site related to this, but none were exactly to the point I wanted to ask.

So the question relates to ridge regression and this link, where there is a statement (in bold) that:

In a least squares fit, the coefficient estimate given by $\hat\beta = (X^TX)^{-1}X^Ty$ has no unique solution when $X^TX$ is not invertible, but the inclusion of $\lambda$ makes the problem non-singular even if $X^TX$ is not invertible.

This was the original motivation for ridge regression (Hoerl and Kennard, 1970).

i.e.

$\hat\beta^{\text{ridge}} = (X^TX + \lambda I)^{-1}X^Ty$

I am unable to exactly convince myself why this would be the case. Specifically, for the case when $p$ (predictors) $> n$ (instances), how will I be able to obtain a unique solution with this?

GeneX

1 Answer


A full-rank matrix is invertible, and adding $\lambda I$ (with $\lambda > 0$) to $X^TX$ makes it full rank.

Here is an example. Suppose $X$ has two identical columns:

> x=cbind(c(1,1,1,1),c(1,1,1,1),c(1,2,3,4))
> x
     [,1] [,2] [,3]
[1,]    1    1    1
[2,]    1    1    2
[3,]    1    1    3
[4,]    1    1    4
> t(x) %*% x
     [,1] [,2] [,3]
[1,]    4    4   10
[2,]    4    4   10
[3,]   10   10   30
> Matrix::rankMatrix(t(x) %*% x)
[1] 2   
> Matrix::rankMatrix(t(x) %*% x + 1e6*diag(3))
[1] 3

$X^TX$ is a symmetric matrix, so it is diagonalizable with real eigenvalues. Since it is $p \times p$, let those eigenvalues be $\lambda_1, \ldots, \lambda_p$.

If we choose any of the $\lambda_i$ above, $(X^TX - \lambda_i I)$ is singular.

However, if $\lambda \ne \lambda_1, \ldots, \lambda_p$, then the eigenvalues of $(X^TX - \lambda I)$ are $\lambda_1 - \lambda, \ldots, \lambda_p - \lambda$, none of which equal $0$. Hence $(X^TX - \lambda I)$ is non-singular. In particular, $X^TX$ is positive semi-definite, so all $\lambda_i \ge 0$; adding $\lambda I$ with any $\lambda > 0$ therefore gives eigenvalues $\lambda_i + \lambda > 0$, and $X^TX + \lambda I$ is always invertible.
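The eigenvalue shift can be checked numerically. Here is a minimal sketch in Python/NumPy (the translation from the R example above is my own), reusing the same rank-deficient $X$ with two identical columns:

```python
import numpy as np

# Same X as the R example: two identical columns, so X'X has rank 2.
X = np.array([[1., 1., 1.],
              [1., 1., 2.],
              [1., 1., 3.],
              [1., 1., 4.]])
G = X.T @ X                      # Gram matrix X'X (singular)

lam = 0.5
eig_G = np.sort(np.linalg.eigvalsh(G))
eig_ridge = np.sort(np.linalg.eigvalsh(G + lam * np.eye(3)))

# Every eigenvalue is shifted by exactly lam, so the zero eigenvalue
# moves to lam > 0 and the matrix becomes invertible.
print(np.allclose(eig_ridge, eig_G + lam))   # True
print(eig_G[0])                              # ~0: G is singular
```

Because $G$ and $G + \lambda I$ share the same eigenvectors, the spectrum shifts rigidly by $\lambda$, which is exactly the argument above.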

More generally, almost any perturbation of a singular matrix yields a non-singular one.
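The same reasoning answers the $p > n$ part of the question: when $p > n$, $X^TX$ has rank at most $n < p$ and is necessarily singular, yet $X^TX + \lambda I$ is full rank, so the ridge estimate is unique. A small Python/NumPy sketch (the dimensions and random data are illustrative choices of mine):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, lam = 3, 5, 1.0            # more predictors than observations
X = rng.standard_normal((n, p))
y = rng.standard_normal(n)

G = X.T @ X                      # p x p, but rank at most n, so singular
print(np.linalg.matrix_rank(G))              # 3, not 5

# Ridge: G + lam*I has full rank, so this linear solve has a unique answer.
beta_ridge = np.linalg.solve(G + lam * np.eye(p), X.T @ y)
print(np.linalg.matrix_rank(G + lam * np.eye(p)))  # 5
```

Without the $\lambda I$ term, `np.linalg.solve` would fail on the singular $G$; infinitely many coefficient vectors fit the data equally well.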

Haitao Du