Are there any case studies or research that go through the process one might take to define acceptable false negative/false positive rates for classification? Say we have a delivery robot and we are trying to classify pedestrians so we don't run them over; I'm just trying to get an intuition for how to approach setting some acceptable rates.
-
There is no universal criterion. It's a decision to be taken depending on the case you're working on. You may be interested in https://en.wikipedia.org/wiki/Precision_and_recall – David Jun 17 '19 at 11:31
-
Thanks, I certainly will be putting things in those terms. – Dance Jun 17 '19 at 16:23
-
Don't use FNR/FPR as KPIs. See [Why is accuracy not the best measure for assessing classification models?](https://stats.stackexchange.com/q/312780/1352) and [Is accuracy an improper scoring rule in a binary classification setting?](https://stats.stackexchange.com/q/359909/1352) and [Classification probability threshold](https://stats.stackexchange.com/q/312119/1352) – Stephan Kolassa Jun 17 '19 at 18:15
4 Answers
In determining appropriate false positive and false negative rates, you should consider cost. Traditionally, one may determine cost using two factors:
1) The cost of a false positive and false negative.
2) The expected number of positive and negative instances in your population.
An example may help. Assume a population of 100 people, 80 who do not have a disease and 20 who do. Assume also that you have a test that detects the disease with a false positive rate of 10% and a false negative rate of 20%. Finally, assume that a false positive costs 5 dollars and a false negative costs 10 dollars. To calculate the expected cost:
Cost = FPR × Cost of FP × Number of Negatives + FNR × Cost of FN × Number of Positives
Cost = 0.1 × 5 × 80 + 0.2 × 10 × 20 = 40 + 40 = 80
As the example demonstrates, changing the number of people who have/don't have the disease, or changing the cost of a false positive or false negative, can change the total net cost (and the costs relative to one another).
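A minimal sketch of this calculation in Python (the function name is mine; the numbers are just the example figures above):

```python
def expected_cost(fpr, fnr, cost_fp, cost_fn, n_negatives, n_positives):
    """Expected total cost given error rates, per-error costs, and class counts."""
    # False positives occur among the true negatives;
    # false negatives occur among the true positives.
    return fpr * cost_fp * n_negatives + fnr * cost_fn * n_positives

# Example figures from above: 80 without the disease, 20 with it,
# FPR = 10%, FNR = 20%, a false positive costs $5, a false negative $10.
print(expected_cost(0.1, 0.2, 5, 10, n_negatives=80, n_positives=20))  # -> 80.0
```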
Hope this helps.

The question is very general, and the answer depends on the nature of your problem and what you're trying to achieve. From your description, this is a safety-critical application, so false negatives (missing a pedestrian) are probably more dangerous than false positives. If that's the case, you may want to prioritise lowering your false negative rate without lowering your total accuracy too much, for example by adjusting the decision threshold as in the sketch below.
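A minimal sketch of that idea, assuming the classifier outputs probability scores; the helper function and toy data are hypothetical, not from any particular library:

```python
import numpy as np

def threshold_for_target_fnr(y_true, y_score, max_fnr=0.01):
    """Return (approximately) the largest decision threshold whose
    false negative rate stays at or below max_fnr."""
    positive_scores = np.asarray(y_score)[np.asarray(y_true) == 1]
    # The FNR at threshold t is the fraction of true positives scored below t,
    # so the max_fnr-quantile of the positive scores is roughly the highest
    # threshold that still catches (1 - max_fnr) of the positives.
    return np.quantile(positive_scores, max_fnr)

# Toy data standing in for a pedestrian detector's predicted probabilities.
rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=10_000)
y_score = np.clip(rng.normal(0.25 + 0.5 * y_true, 0.15), 0.0, 1.0)

t = threshold_for_target_fnr(y_true, y_score, max_fnr=0.01)
y_pred = (y_score >= t).astype(int)
print(f"threshold={t:.3f}, achieved FNR={np.mean(y_pred[y_true == 1] == 0):.3f}")
```

Lowering the threshold trades false negatives for false positives, so you would still want to check the resulting false positive rate and overall accuracy.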

There is no such thing as a "generally acceptable" threshold for these metrics (the same is true of any other metric). Let me give two examples:
- Imagine that you are building a killer robot to guard your house. You want the false positive rate to be 0%, since otherwise the robot will literally kill you.
- On the other hand, if your task is finding the right audience for pop-up marketing notifications, then finding the threshold is about balancing the losses from showing unwanted ads to customers against the gains from broadening your ad audience.
The two scenarios differ dramatically in what counts as an "acceptable" false positive rate.
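A rough sketch of the second scenario, picking the threshold that maximizes expected net value (the per-customer gain and annoyance-cost figures, and the function name, are made up for illustration):

```python
import numpy as np

def best_ad_threshold(y_true, y_score, gain_per_hit=1.0, cost_per_miss=0.2):
    """Sweep candidate thresholds and pick the one maximizing net value:
    gains from interested customers reached, minus losses from
    annoying uninterested ones."""
    y_true = np.asarray(y_true)
    y_score = np.asarray(y_score)
    net_values = []
    thresholds = np.unique(y_score)
    for t in thresholds:
        shown = y_score >= t
        hits = np.sum(shown & (y_true == 1))    # interested customers reached
        misses = np.sum(shown & (y_true == 0))  # uninterested customers annoyed
        net_values.append(gain_per_hit * hits - cost_per_miss * misses)
    return thresholds[int(np.argmax(net_values))]
```

The same sweep applied to the killer-robot scenario, with the cost of a false positive set near infinity, would push the threshold to the point where no false positives occur at all.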

It really depends on your problem! Take the typical problem of using machine learning to determine whether a tumor is malignant: would you trust a model with a high number of false negatives? Probably not; in that case, you would prefer false positives. However, you would still need very high accuracy: you wouldn't want to tell a patient they had cancer when they didn't, so you might need an accuracy of 90-95% or higher.
However, in some other model with other outputs, you might prefer a higher rate of false negatives; in some models, you could even be satisfied with lower accuracy. It all depends on your data and, most importantly, on your objective!
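For concreteness, here is how you might read these rates off a confusion matrix with scikit-learn (the labels are toy values purely for illustration):

```python
from sklearn.metrics import confusion_matrix

# Toy labels for illustration: 1 = malignant, 0 = benign.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 0, 0, 0]

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
fnr = fn / (fn + tp)  # malignant tumors the model missed
fpr = fp / (fp + tn)  # benign tumors flagged as malignant
accuracy = (tp + tn) / (tp + tn + fp + fn)
print(f"FNR={fnr:.2f}, FPR={fpr:.2f}, accuracy={accuracy:.2f}")
```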
