
The AIC is an approximately unbiased estimator of the (relative) risk under Kullback-Leibler loss. I read that if you use AIC to choose among a family of models, AIC (approximately) yields the model with minimal expected MSE on a test set under hypothetically infinite resampling. I don't understand the connection between KL divergence and expected MSE. When is the model with minimal expected MSE the same as the model with minimal expected KL divergence?
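One accessible special case where the two criteria coincide (a sketch, not a full answer): under a Gaussian likelihood with a fixed, known error variance, the negative log-likelihood is an affine, strictly increasing function of the in-sample MSE, so ranking models by likelihood is the same as ranking them by MSE; for models with equal parameter counts, the AIC penalty cancels and the AIC ranking reduces to the MSE ranking too. The data, variance, and model pair below are made up purely for illustration:

```python
import numpy as np

# Hypothetical toy data (all numbers here are made up for illustration):
# y depends linearly on x plus Gaussian noise.
rng = np.random.default_rng(0)
n = 200
x = rng.uniform(-1.0, 1.0, n)
y = 1.0 + 2.0 * x + rng.normal(0.0, 0.5, n)

sigma2 = 0.25  # error variance assumed known and fixed across models

def gaussian_neg_loglik(y, yhat, sigma2):
    """Negative Gaussian log-likelihood with known variance sigma2.

    -log L = (n/2) * log(2*pi*sigma2) + n * MSE / (2 * sigma2),
    i.e. an affine, strictly increasing function of the in-sample MSE.
    """
    n = len(y)
    mse = np.mean((y - yhat) ** 2)
    return 0.5 * n * np.log(2 * np.pi * sigma2) + n * mse / (2 * sigma2)

# Two candidate fits: intercept-only vs. linear (ordinary least squares).
yhat_const = np.full(n, y.mean())
X = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
yhat_lin = X @ beta

mse_const = np.mean((y - yhat_const) ** 2)
mse_lin = np.mean((y - yhat_lin) ** 2)
nll_const = gaussian_neg_loglik(y, yhat_const, sigma2)
nll_lin = gaussian_neg_loglik(y, yhat_lin, sigma2)
```

Because the mapping from MSE to negative log-likelihood is monotone here, whichever model wins on MSE also wins on likelihood. The coincidence relies on the Gaussian form with known, model-independent variance; with an estimated or model-specific variance, or a non-Gaussian likelihood, the two orderings can diverge.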

Richard Hardy
Joe_base
    AIC (approximately) yields the model with maximal expected likelihood, not minimal expected MSE. The question is when the two coincide (exactly or approximately). I had some related questions before: ["Optimality of AIC w.r.t. loss functions used for evaluation"](https://stats.stackexchange.com/questions/425675), ["Equivalence of AIC and LOOCV under mismatched loss functions"](https://stats.stackexchange.com/questions/406430). Yours might even be a duplicate of the first one. – Richard Hardy Jun 10 '20 at 13:57
    Does this answer your question? [Optimality of AIC w.r.t. loss functions used for evaluation](https://stats.stackexchange.com/questions/425675/optimality-of-aic-w-r-t-loss-functions-used-for-evaluation) – Richard Hardy Jun 10 '20 at 14:01
  • Thank you. They are very helpful, although I still have trouble coming up with an accessible (toy) example. – Joe_base Jun 11 '20 at 09:07

0 Answers