I am building a logistic regression model for a churn problem. When I scored the out-of-sample data set, the predicted probabilities were very low. Conventionally, I would use 0.5 as the cutoff, but very few customers in this scored population are above 0.5 (say, just 1%). For the business case, we need to approach at least 5% of customers to have an impact.
I therefore lowered the cutoff probability used to classify the scored data set, and I am now using 0.1 as the cutoff. The model performs very well at that level, in that it perfectly distinguishes my target from non-target customers.
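To make the setup concrete, here is a minimal sketch of choosing the cutoff from the share of customers to approach rather than fixing it at 0.5. The Beta-distributed `scores` are synthetic stand-ins for my scored probabilities, and the helper names `cutoff_for_top_fraction` and `share_flagged` are hypothetical, introduced only for illustration.

```python
import random


def cutoff_for_top_fraction(scores, fraction):
    """Return the score such that roughly `fraction` of scores are >= it."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    ranked = sorted(scores, reverse=True)
    # Number of customers to approach, at least one.
    k = max(1, round(fraction * len(ranked)))
    return ranked[k - 1]


def share_flagged(scores, cutoff):
    """Fraction of the population whose score is at or above the cutoff."""
    return sum(s >= cutoff for s in scores) / len(scores)


random.seed(0)
# Assumption: synthetic scores skewed toward zero, as a stand-in for
# the low predicted probabilities seen on the scored population.
scores = [random.betavariate(1, 12) for _ in range(10_000)]

cutoff = cutoff_for_top_fraction(scores, 0.05)
print("share above 0.5:      ", share_flagged(scores, 0.5))
print("cutoff for top 5%:    ", cutoff)
print("share above new cutoff:", share_flagged(scores, cutoff))
```

In this sketch the cutoff is simply the score of the customer at the 5% rank, so it changes the number of customers flagged but not their ordering.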
- Is there any problem with this approach, given that the model has very good accuracy at the 0.1 cutoff?
- What, in general, causes low predicted probabilities on a scored population?