I am testing various techniques for dealing with strong multi-collinearity (MC) in a regression problem.
Various comparison papers have been written on competing techniques such as Ridge Regression (RR) and Principal Components Regression (PCR). There seems to be no clear winner, with the best technique apparently being problem specific. However, one thing that bothers me about the PCR approach is the somewhat arbitrary way in which one simply excludes the eigenvectors with the smallest eigenvalues: as Hadi and Ling show, even the eigenvector with the smallest eigenvalue may have strong predictive power, while the largest eigenvectors may have none.
"Some Cautionary notes on the use of Principal Components Regression" by Hadi and Ling. (PDF)
They also show that the SSE can be vastly improved by adding seemingly insignificant eigenvectors to the model.
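To make that point concrete, here is a small simulation I put together (a sketch under my own assumptions, not an example taken from the paper), in which the response is driven entirely by the component with the smallest eigenvalue, so a PCR that keeps only the leading components misses essentially all of the predictive signal:

```python
# Toy illustration: the response loads on the SMALLEST-eigenvalue component,
# so PCR that keeps only the leading components misses the signal.
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Two nearly collinear predictors plus an unrelated one -> strong MC.
z = rng.normal(size=n)
X = np.column_stack([z + 0.01 * rng.normal(size=n),
                     z + 0.01 * rng.normal(size=n),
                     rng.normal(size=n)])
Xc = X - X.mean(axis=0)

# Principal components via SVD; the columns of Vt.T are the eigenvectors of X'X,
# ordered from largest to smallest eigenvalue.
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
T = Xc @ Vt.T                          # component scores

# The response depends only on the last (smallest-eigenvalue) component.
y = T[:, -1] / T[:, -1].std() + 0.1 * rng.normal(size=n)

def sse_keeping(k):
    """SSE of a PCR fit that keeps only the first k components."""
    Tk = T[:, :k]
    beta, *_ = np.linalg.lstsq(Tk, y, rcond=None)
    resid = y - Tk @ beta
    return resid @ resid

for k in (1, 2, 3):
    print(f"components kept = {k:d}   SSE = {sse_keeping(k):8.2f}")
# The SSE collapses only when the "insignificant" smallest component is included.
```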
In their discussion they highlight two papers that try to address this second deficiency, Lott (1973) and Gunst and Mason (1973), but it has been shown that the Lott technique fails to pick the "correct" eigenvectors in the presence of strong MC, and my problem has strong MC.
Do you know of a paper describing a method that can select the optimal set of eigenvectors even in the presence of strong MC? Or of more recent papers that compare PCR and RR?