Having used randomForest in R to produce a fairly successful classifier, is there any way to emphasise sensitivity over specificity? For example, when the cost of missing a disease is much greater than the cost of a false positive diagnosis.

  • It looks like the bst package in R will accept weights for false positives and false negatives. I'm going to leave the question open though; it might be useful to others. –  Nov 25 '11 at 15:42
  • "Using random forest for reliable classification and cost-sensitive learning for medical diagnosis": http://www.biomedcentral.com/content/pdf/1471-2105-10-S1-S22.pdf –  Nov 25 '11 at 16:01

1 Answer

I don't actually know the R package, but if you can lower the cost of false positives relative to false negatives, it should work. This is also the rationale behind MetaCost by Domingos, which is implemented in WEKA. Naturally, increasing sensitivity will decrease specificity. From the information retrieval point of view, as you increase recall, precision will generally decrease.
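
One way to see why shifting the costs works, as a sketch only: assume correct decisions cost nothing and that $p$ is a reasonably calibrated estimate of the probability of disease for a record. Writing $C_{FN}$ and $C_{FP}$ for the costs of a false negative and a false positive (symbols introduced here, not taken from any package), predicting positive has the lower expected cost when

$$
p\,C_{FN} > (1-p)\,C_{FP}
\quad\Longleftrightarrow\quad
p > \frac{C_{FP}}{C_{FP} + C_{FN}}.
$$

With equal costs the threshold is 0.5. If a missed case is taken to be nine times as costly as a false alarm, it drops to $1/(1+9) = 0.1$, so many more records are called positive.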

Because a Random Forest uses decision trees as base classifiers, and these can output probabilities, you can lower the cut-off at which a record is classified as positive. This will make your Random Forest more sensitive but less precise.
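
To make the trade-off concrete, here is a minimal sketch in base R. The vote fractions and labels are made up for illustration, and `classify_at` and `sens_spec` are hypothetical helper names, not part of any package.

```r
# Toy data (made up): true labels and the fraction of trees voting "pos".
truth <- c("pos", "pos", "pos", "neg", "neg", "neg", "neg", "neg")
p_pos <- c(0.80, 0.45, 0.30, 0.40, 0.20, 0.10, 0.55, 0.05)

# Hypothetical helper: call a record "pos" when its vote fraction
# reaches the cut-off.
classify_at <- function(p_pos, cutoff = 0.5) {
  ifelse(p_pos >= cutoff, "pos", "neg")
}

# Hypothetical helper: sensitivity and specificity of a set of predictions.
sens_spec <- function(truth, pred) {
  c(sensitivity = mean(pred[truth == "pos"] == "pos"),
    specificity = mean(pred[truth == "neg"] == "neg"))
}

default_cut <- sens_spec(truth, classify_at(p_pos, 0.50))
lower_cut   <- sens_spec(truth, classify_at(p_pos, 0.25))

# Compare the two operating points side by side.
print(rbind(default_cut, lower_cut))
```

On this toy data the lower cut-off catches every positive but misclassifies more negatives, which is the trade-off described above. In practice the cut-off would be chosen on held-out or out-of-bag predictions rather than by eye.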

Simone
  • Sure. I'll add that in the case of randomForest, the OOB votes are in the `votes` element of the `randomForest` object; for prediction, one must use `predict` with `type="votes"` or `type="prob"`. –  Nov 25 '11 at 22:41
  • ...and here's an example of how to do that: http://stackoverflow.com/questions/31130053/roc-for-random-forest/31149340?noredirect=1#comment50311605_31149340 – Soren Havelund Welling Jul 01 '15 at 23:56
  • Also another option is to use the weighted Gini gain http://stats.stackexchange.com/a/69335/2719 – Simone May 22 '16 at 10:01
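
For anyone wanting to try the `votes` / `predict` route from the comments, here is a rough sketch. The two-class problem built from `iris`, the choice of "virginica" as the class not to miss, and the 0.3 cut-off are all assumptions made purely for illustration.

```r
library(randomForest)

set.seed(1)

# Toy two-class problem (assumption: "virginica" plays the role of disease).
dat <- iris
dat$y <- factor(ifelse(dat$Species == "virginica", "pos", "neg"))
dat$Species <- NULL

train_idx <- sample(nrow(dat), 100)
train <- dat[train_idx, ]
test  <- dat[-train_idx, ]

rf <- randomForest(y ~ ., data = train, ntree = 200)

# Out-of-bag vote fractions for the training rows, one column per class.
oob_pos <- rf$votes[, "pos"]

# Vote fractions for new data.
test_pos <- predict(rf, newdata = test, type = "prob")[, "pos"]

# Default majority-vote labels versus labels from a lowered cut-off.
cutoff <- 0.3  # illustrative value, not a recommendation
pred_default <- predict(rf, newdata = test)
pred_lowered <- factor(ifelse(test_pos >= cutoff, "pos", "neg"),
                       levels = levels(test$y))

print(table(truth = test$y, default = pred_default))
print(table(truth = test$y, lowered = pred_lowered))
```

The OOB fractions in `oob_pos` are a convenient place to pick the cut-off, since they come from trees that did not see the row in question.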