I'm reading through the ESL book and I'm at the part on ridge regression where the effective degrees of freedom are defined as $$ \mathrm{df}(\lambda) = \operatorname{tr}\left(X(X'X + \lambda I)^{-1}X'\right) = \sum_{j=1}^p \frac{d_j^2}{d_j^2 + \lambda}, $$ where the $d_j$ are the singular values of $X$.

I have no idea where this comes from. What is the idea and intuition behind it?

Nikola
    For the derivation of the terms in the sum see https://stats.stackexchange.com/a/220324/919. The interpretation of their sum as a "degrees of freedom" originates in the trace of the (usual) "hat matrix" $H.$ – whuber Jan 23 '21 at 18:44
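
As a quick numerical sanity check of the identity in the question, here is a minimal NumPy sketch. The function names `ridge_df_trace` and `ridge_df_svd`, the random design matrix, and the chosen values of $\lambda$ are illustrative assumptions, not anything from ESL. It computes the trace of the ridge "hat matrix" $X(X'X + \lambda I)^{-1}X'$ directly and compares it with the sum over the singular values $d_j$ of $X$.

```python
import numpy as np

def ridge_df_trace(X, lam):
    """Effective df as the trace of the ridge hat matrix X (X'X + lam I)^{-1} X'."""
    p = X.shape[1]
    # Solve (X'X + lam I) B = X' instead of forming an explicit inverse.
    B = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T)
    return np.trace(X @ B)

def ridge_df_svd(X, lam):
    """Effective df as sum_j d_j^2 / (d_j^2 + lam), with d_j the singular values of X."""
    d = np.linalg.svd(X, compute_uv=False)
    return np.sum(d**2 / (d**2 + lam))

# Hypothetical example data: 50 observations, 5 predictors.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))

for lam in (0.0, 1.0, 10.0, 100.0):
    print(lam, ridge_df_trace(X, lam), ridge_df_svd(X, lam))
```

The two columns should agree up to rounding. With $\lambda = 0$ and $X$ of full column rank, both reduce to the trace of the usual hat matrix, which is $p$ (here 5). As $\lambda$ grows, each term $d_j^2/(d_j^2 + \lambda)$ shrinks toward 0, so the effective degrees of freedom fall below $p$.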

0 Answers