In this lecture by Hastie and Tibshirani, it is mentioned that with case-control samples we can still estimate the logistic regression slope coefficients accurately, but the estimated intercept is incorrect. The lecture also gives a formula to correct the intercept.
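For reference, the correction as I understand it is $\hat\beta_0^* = \hat\beta_0 + \log\frac{\pi}{1-\pi} - \log\frac{\tilde\pi}{1-\tilde\pi}$, where $\pi$ is the true population prevalence of cases and $\tilde\pi$ is the proportion of cases in the case-control sample. Below is a minimal sketch of that adjustment; the function name `correct_intercept` and its argument names are my own and do not come from the lecture.

```python
import math


def correct_intercept(beta0_hat, pi_true, pi_sample):
    """Adjust a logistic regression intercept fitted on a case-control sample.

    beta0_hat : intercept estimated from the case-control sample
    pi_true   : assumed prevalence of cases in the population
    pi_sample : proportion of cases in the case-control sample
    """
    for p in (pi_true, pi_sample):
        if not 0.0 < p < 1.0:
            raise ValueError("prevalences must lie strictly between 0 and 1")
    # Add the population log-odds and subtract the sample log-odds.
    population_log_odds = math.log(pi_true / (1.0 - pi_true))
    sample_log_odds = math.log(pi_sample / (1.0 - pi_sample))
    return beta0_hat + population_log_odds - sample_log_odds


# Example: a balanced sample (50% cases) drawn from a population with 1% cases.
corrected = correct_intercept(beta0_hat=0.0, pi_true=0.01, pi_sample=0.5)
print(round(corrected, 4))
```

With a balanced sample the sample log-odds term is zero, so the corrected intercept is just the fitted intercept shifted by the population log-odds.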
Could you please mathematically explain why this happens?
Additionally, could you please explain why case-control sampling is necessary in the first place, and which classification methods are particularly sensitive to imbalanced class priors?