Applying the normal equation to a ridge regression proof?

Question

A previous answer deriving ridge regression states at one point that from the following equation:

$$(y_∗−X_∗β)′(y_∗−X_∗β)=(y−Xβ)′(y−Xβ)+λβ′β$$

It follows that

$$(X′_∗X_∗)β=X′_∗y_∗$$

The original author states that "From the form of the left hand expression it is immediate that the Normal equations are...". I do not understand why this follows, and would like to know more on the matter.

score 3 · Accepted Answer · answered Oct 29 '17 at 19:41

This is by direct analogy with the normal equations of linear regression

$$(y−Xβ)′(y−Xβ) = 0 \Rightarrow X′X\beta = X′y $$
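
One way to read this implication, assuming the "$= 0$" is shorthand for minimizing the squared error over $\beta$ (setting its gradient to zero) rather than literally making the error vanish:

$$\frac{\partial}{\partial\beta}\,(y - X\beta)'(y - X\beta) = -2X'(y - X\beta) = 0 \quad\Longrightarrow\quad X'X\beta = X'y$$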

The poster is taking the above implication as known. By augmenting $X$ and $y$ with the fake ridge observations we arrive at the analogous expression for ridge regression

$$(y_∗−X_∗β)′(y_∗−X_∗β) = 0$$

Which, by analogy with the above, must have the solutions

$$ X′_* X_* \beta = X′_* y_* $$
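
A quick numerical check of this analogy, as a sketch only: it assumes the fake observations are the usual construction of $\sqrt{\lambda}\,I$ rows appended to $X$ with zeros appended to $y$ (the construction is not spelled out in this thread), and the function names `ridge_via_augmentation` and `ridge_closed_form` are made up for illustration.

```python
import numpy as np

def ridge_via_augmentation(X, y, lam):
    """Ridge estimate obtained as ordinary least squares on augmented data."""
    p = X.shape[1]
    # Assumed construction: p fake rows sqrt(lam) * I, with zero responses.
    X_star = np.vstack([X, np.sqrt(lam) * np.eye(p)])
    y_star = np.concatenate([y, np.zeros(p)])
    # Plain least squares on (X_star, y_star).
    beta, *_ = np.linalg.lstsq(X_star, y_star, rcond=None)
    return beta

def ridge_closed_form(X, y, lam):
    """Solve (X'X + lam * I) beta = X'y directly."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

# Synthetic data, purely for illustration.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=50)
lam = 2.0

beta_aug = ridge_via_augmentation(X, y, lam)
beta_direct = ridge_closed_form(X, y, lam)
print(np.allclose(beta_aug, beta_direct))
```

Under that assumption, $X′_* X_* = X′X + \lambda I$ and $X′_* y_* = X′y$, so the two estimates should agree up to floating-point error.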

Matthew Drury

Let me gauge if I understand. So given a user-selected $\lambda$, minimization is equivalent to finding a $\beta$ such that the right-hand side (of my equation) is 0. Then the third equation follows from the corollary (the first equation?). – Aleksey Bilogur Oct 29 '17 at 20:12
Yup. That seems correct to me. – Matthew Drury Oct 29 '17 at 20:16
