I'm using logistic regression to perform binary classification with training, CV, and test sets. When is the most appropriate time to pick a discrimination threshold to balance positive and negative error rates? Should I use the CV set to determine a desired threshold, and then apply the fixed threshold to the test set to assess classification performance? Or should I determine the threshold using only the test set?
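A minimal sketch of the first option in the question: pick the threshold on the CV set, then freeze it and score the test set once. The scores and labels below are made-up illustration data, `error_rates` and `pick_threshold` are hypothetical helper names, and "balance" is assumed here to mean minimizing the gap between the false positive rate and the false negative rate.

```python
def error_rates(scores, labels, threshold):
    """Return (false_positive_rate, false_negative_rate) at a threshold.

    An example is predicted positive when its score is >= threshold.
    Assumes labels contain at least one 0 and at least one 1.
    """
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    negatives = sum(1 for y in labels if y == 0)
    positives = sum(1 for y in labels if y == 1)
    return fp / negatives, fn / positives


def pick_threshold(scores, labels):
    """Pick the observed score that minimizes |FPR - FNR| (assumed criterion)."""
    def gap(threshold):
        fpr, fnr = error_rates(scores, labels, threshold)
        return abs(fpr - fnr)

    # Candidate thresholds are the distinct scores seen in the data.
    return min(sorted(set(scores)), key=gap)


# Hypothetical predicted probabilities from an already-fitted model.
cv_scores = [0.10, 0.20, 0.35, 0.40, 0.55, 0.60, 0.80, 0.90]
cv_labels = [0, 0, 0, 1, 0, 1, 1, 1]
test_scores = [0.15, 0.30, 0.50, 0.58, 0.45, 0.65, 0.70, 0.95]
test_labels = [0, 0, 0, 0, 1, 1, 1, 1]

# Step 1: choose the threshold using the CV set only.
threshold = pick_threshold(cv_scores, cv_labels)

# Step 2: apply the frozen threshold to the test set, without re-tuning.
test_fpr, test_fnr = error_rates(test_scores, test_labels, threshold)
print(threshold, test_fpr, test_fnr)
```

If the threshold were instead tuned on the test set, the test error rates would no longer be an unbiased estimate of performance on new data, because the threshold would have been fitted to those same examples.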
Why do you want to build a classifier? See https://stats.stackexchange.com/questions/130420/logistic-regression-how-good-is-my-model. The performance of a logistic regression is better measured with scoring rules (for example, the c-statistic, also called AUROC). – Federico Tedeschi Jul 06 '17 at 14:26