What problem do regularization methods solve? I thought the point was feature selection and preventing overfitting. However, I was told that the reason Ridge, Lasso, and Elastic Net were created in the first place was "to deal with collinearity," and I am not readily finding anything online to support this.
Wikipedia says, "Regularization, in mathematics and statistics and particularly in the fields of machine learning and inverse problems, is a process of introducing additional information in order to solve an ill-posed problem or to prevent overfitting."
Where does linear dependence among the columns come into play in regularization? For example, if the columns exhibit multicollinearity, how does regularization decide which features to keep and which to shrink towards zero?
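To make the question concrete, here is a small illustration of the behavior I am asking about (a sketch using scikit-learn, with made-up data: two nearly identical columns `x1` and `x2`, where only `x1` actually drives `y`):

```python
import numpy as np
from sklearn.linear_model import Ridge, Lasso

rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + 0.01 * rng.normal(size=n)   # nearly collinear copy of x1
X = np.column_stack([x1, x2])
y = 3 * x1 + rng.normal(scale=0.5, size=n)

# Ridge (L2) tends to split the weight across the correlated columns;
# Lasso (L1) tends to keep one column and push the other towards zero.
ridge = Ridge(alpha=1.0).fit(X, y)
lasso = Lasso(alpha=0.1).fit(X, y)
print("Ridge coefficients:", ridge.coef_)
print("Lasso coefficients:", lasso.coef_)
```

In runs like this, Ridge spreads the coefficient mass over both correlated columns, while Lasso concentrates it on one of them. What I do not understand is the mechanism: how does the penalty "know" which of the two columns to keep?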