
I want to make my classifier prioritise finding true cases (1) even if that means that a lot of the false cases (0) are also classified as true.

Specifically, I wish to find the feature weights that classify at least 97% of my true cases as true while misclassifying as small a fraction as possible of my false cases as true.

What should I be looking for in a logistic regression setting? Of course I can just adjust the intercept, but I assume there must be a better way?

kjetil b halvorsen
fixingstuff
    sometimes questions get overlooked, or take some time to get noticed. This is a very good question and I've upvoted it, but you might need to wait a few days for an answer – shadowtalker Jan 12 '15 at 05:14
    You have started off your project with multiple misconceptions. See for example http://www.fharrell.com/2017/01/classification-vs-prediction.html – Frank Harrell Jun 11 '17 at 11:56

1 Answer


First, logistic regression is not a classifier; it is a method for probability estimation. Unbalanced classes are not a problem per se, unless there simply is not enough data in the class with few observations. See for instance Does an unbalanced sample matter when doing logistic regression? for more information.

You do not need to make any changes to the estimation procedure. If you need a classification, you simply apply a different loss function (decision rule) to the estimated probabilities afterwards. That is, separate the estimation problem from using the estimated model to make decisions.
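As a concrete illustration of this separation, here is a minimal sketch (assuming scikit-learn; the synthetic dataset and variable names are illustrative): fit the logistic regression as usual, then choose the decision threshold on the predicted probabilities afterwards so that at least 97% of the true cases are classified as true.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Illustrative, imbalanced synthetic data: ~20% true cases.
X, y = make_classification(n_samples=2000, weights=[0.8, 0.2], random_state=0)

# Estimation step: unchanged, ordinary logistic regression.
model = LogisticRegression().fit(X, y)
p = model.predict_proba(X)[:, 1]  # estimated P(y = 1 | x)

# Decision step: pick the largest threshold whose recall on the
# true cases is at least 97%, scanning thresholds from high to low.
for t in np.sort(np.unique(p))[::-1]:
    pred = p >= t
    recall = pred[y == 1].mean()
    if recall >= 0.97:
        break

print(f"threshold={t:.3f}, recall={recall:.3f}, "
      f"false positive rate={pred[y == 0].mean():.3f}")
```

Lowering the threshold trades a higher false-positive rate for higher sensitivity; the fitted coefficients never change, only the decision rule does. (In practice the threshold should be chosen on held-out data, not the training set.)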

kjetil b halvorsen