scoring metric for regression that does not weight outliers heavily

Question

I'm using the root mean squared error (RMSE) as the metric for tuning my model's parameters through cross-validation in a regression problem. However, I'm less interested in all predictions being good: I want about 20% to 40% of my predictions to be "spot-on" and don't care if the other 60% to 80% are garbage.

What metric would be best for this?

Look at the entire distribution of residuals. This should be directly available in any decent software after regression. No omnibus statistic can be anything more than a particular summary. Unfortunately, you should care if 60% or 80% of your predictions are garbage; that may mean that you are fitting an inappropriate model, and that in turn may mean that the model is not to be trusted anyway. – Nick Cox, Oct 30 '15 at 10:12
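
To illustrate what looking at the whole residual distribution could mean in practice, here is a minimal sketch that reports the extremes and quartiles of the residuals instead of a single number. The function name `residual_summary` and the sample values are hypothetical, chosen only for illustration; residuals are taken as observed minus predicted.

```python
import statistics


def residual_summary(y_true, y_pred):
    """Summarize residuals (observed minus predicted) by min, quartiles, and max.

    Hypothetical helper for illustration only; not taken from any library.
    """
    if len(y_true) != len(y_pred) or len(y_true) < 2:
        raise ValueError("need two equally long sequences with at least 2 values")
    residuals = sorted(t - p for t, p in zip(y_true, y_pred))
    # statistics.quantiles with n=4 returns the three quartile cut points.
    q1, median, q3 = statistics.quantiles(residuals, n=4, method="inclusive")
    return {
        "min": residuals[0],
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": residuals[-1],
    }


# Made-up data: three exact predictions and two large misses.
summary = residual_summary([10, 12, 9, 30, 11], [10, 12, 9, 12, 25])
print(summary)
```

A summary like this shows both the well-predicted bulk and the large misses, which a single statistic such as RMSE folds into one value.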

score 0 · Answer 1 · edited Apr 13 '17 at 12:44

Answering my own question here: it seems that the root mean squared logarithmic error (RMSLE) is a suitable metric; see this CV post.
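
For reference, a minimal sketch of how RMSLE is commonly computed. It assumes the usual log(1 + x) convention, which requires all values to be greater than -1; the linked post may define it slightly differently. The function name `rmsle` is just a label for this sketch.

```python
import math


def rmsle(y_true, y_pred):
    """Root mean squared logarithmic error with the log(1 + x) convention.

    Assumption: every value in y_true and y_pred is greater than -1.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("need two non-empty sequences of equal length")
    squared_log_errors = [
        (math.log1p(p) - math.log1p(t)) ** 2 for t, p in zip(y_true, y_pred)
    ]
    return math.sqrt(sum(squared_log_errors) / len(squared_log_errors))


# Made-up values: the same absolute miss of 50 on a small and a large target.
print(rmsle([100], [150]))
print(rmsle([10000], [10050]))
```

Because the errors are taken on the log scale, RMSLE measures relative rather than absolute deviation, so a large absolute miss on a large target is penalized less than it would be under RMSE.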

answered Oct 30 '15 at 10:01

spore234

This can't be a general answer without a rationale. – Nick Cox Oct 30 '15 at 10:14
