For a numerical outcome, is thresholding first and classification the same as regression and thresholding after the prediction?

Asked Nov 21 '16 at 18:29

Active Nov 29 '16 at 22:24


Suppose we want to predict not someone's income itself, but only whether that person makes more or less than a given amount (say $50k). When would it be better to approach this as a classification problem ("0=less than 50k, 1=more than 50k") rather than as a regression problem? That is, would it make more sense to bin the income and then perform a classification, or to perform a regression and then bin the predicted value?

There has been a question about whether every classification problem can be approached as a regression problem (here). I'd be interested to know whether the answer is the same for this special case.

edited Apr 13 '17 at 12:44

Community

asked Nov 21 '16 at 18:29

oW_


Set up validation and try both. I'd suspect the regression approach will do better. In general, throwing away information via thresholding isn't good unless it somehow removes irrelevant noise. – Ryan Bressler Nov 29 '16 at 22:43
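To make the "try both" suggestion concrete, here is a minimal sketch of such a comparison on a held-out set. Everything in it is an assumption for illustration: the data are synthetic (a single made-up feature, years of experience, driving income plus Gaussian noise), the classifier is a plain logistic regression fitted by gradient descent, the regressor is ordinary least squares, and the helper names (`make_data`, `fit_ols`, `fit_logistic`, `compare`) are hypothetical. It only shows the shape of the experiment, not which approach wins on real data.

```python
import math
import random

THRESHOLD = 50.0  # income cutoff in thousands (the $50k from the question)


def make_data(n, rng):
    """Synthetic (experience, income) pairs; purely illustrative, not real data."""
    data = []
    for _ in range(n):
        x = rng.uniform(0, 40)
        income = 30.0 + 1.0 * x + rng.gauss(0, 8)
        data.append((x, income))
    return data


def fit_ols(xs, ys):
    """Closed-form simple linear regression; returns (intercept, slope)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def fit_logistic(xs, labels, lr=0.5, epochs=500):
    """Logistic regression via batch gradient descent on a standardized feature."""
    n = len(xs)
    mx = sum(xs) / n
    sd = math.sqrt(sum((x - mx) ** 2 for x in xs) / n)
    zs = [(x - mx) / sd for x in xs]
    b0 = b1 = 0.0
    for _ in range(epochs):
        g0 = g1 = 0.0
        for z, t in zip(zs, labels):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * z)))
            g0 += p - t
            g1 += (p - t) * z
        b0 -= lr * g0 / n
        b1 -= lr * g1 / n
    return b0, b1, mx, sd


def accuracy(preds, labels):
    return sum(p == t for p, t in zip(preds, labels)) / len(labels)


def compare(seed=0, n_train=600, n_test=400):
    """Return (accuracy of threshold-then-classify, accuracy of regress-then-threshold)."""
    rng = random.Random(seed)
    train, test = make_data(n_train, rng), make_data(n_test, rng)
    x_tr, y_tr = [x for x, _ in train], [y for _, y in train]
    x_te, y_te = [x for x, _ in test], [y for _, y in test]
    lab_tr = [int(y > THRESHOLD) for y in y_tr]
    lab_te = [int(y > THRESHOLD) for y in y_te]

    # Approach A: bin the income first, then fit a classifier on the 0/1 labels.
    b0, b1, mx, sd = fit_logistic(x_tr, lab_tr)
    pred_a = [int(b0 + b1 * (x - mx) / sd > 0) for x in x_te]

    # Approach B: fit a regression on the raw income, then bin the prediction.
    intercept, slope = fit_ols(x_tr, y_tr)
    pred_b = [int(intercept + slope * x > THRESHOLD) for x in x_te]

    return accuracy(pred_a, lab_te), accuracy(pred_b, lab_te)


if __name__ == "__main__":
    acc_clf, acc_reg = compare()
    print(f"threshold then classify: {acc_clf:.3f}")
    print(f"regress then threshold:  {acc_reg:.3f}")
```

Both approaches are scored on the same binary target, so the accuracies are directly comparable. On real data you would swap in your own features and models, and probably use cross-validation rather than a single split.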


0 Answers