I have a general question about asymmetric costs. In machine learning problems, there are times when the cost of a false positive differs from the cost of a false negative. Accordingly, models should be built to account for this cost asymmetry.
How is this done for a random forest?
Some possible ways are:
- Changing the information gain (or impurity reduction) used to evaluate candidate splits in a decision tree, so that errors on the two classes are weighted unequally (a sketch of this follows the list)
- Adjusting the threshold away from 0.5 within each leaf when assigning the predicted label in an individual decision tree
- Adjusting the threshold away from 0.5 when the collection of trees "votes" on the predicted label for the random forest as a whole (see the second sketch at the end of the post)
- Using an ROC curve and choosing a threshold other than the usual one (commonly, the threshold closest to the top-left corner of the ROC graph is taken as "ideal")
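
To make the first option concrete, here is a minimal sketch of how I understand it would look in scikit-learn, where the `class_weight` parameter reweights samples by class inside the impurity calculation used to score splits; the dataset and the 5:1 weighting are illustrative assumptions, not recommendations:

```python
# A minimal sketch of the first bullet, assuming scikit-learn.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic, imbalanced toy data (illustrative only).
X, y = make_classification(n_samples=2000, weights=[0.9, 0.1], random_state=0)

# class_weight is converted to per-sample weights, so errors on class 1
# count 5x as much as errors on class 0 when candidate splits are scored.
clf = RandomForestClassifier(
    n_estimators=200,
    class_weight={0: 1, 1: 5},
    random_state=0,
).fit(X, y)
```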
Which of these approaches are used in practice to account for asymmetric costs?
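
For reference, here is a second minimal sketch covering the threshold-moving options (the third and fourth bullets), again assuming scikit-learn; the costs, the expected-cost threshold formula `cost_fp / (cost_fp + cost_fn)`, and the top-left-corner ROC heuristic are just two common ways to pick a threshold, shown with made-up numbers:

```python
# A minimal sketch of the threshold-moving options, assuming scikit-learn.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_curve
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, weights=[0.9, 0.1], random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=0)

clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)
proba = clf.predict_proba(X_val)[:, 1]  # mean of the per-tree class probabilities

# Option A: derive the threshold from an assumed cost ratio. If a false
# negative costs 5x a false positive, minimizing expected cost means
# predicting positive when p > cost_fp / (cost_fp + cost_fn) = 1/6.
cost_fp, cost_fn = 1.0, 5.0
y_pred_cost = (proba >= cost_fp / (cost_fp + cost_fn)).astype(int)

# Option B: read a threshold off the ROC curve, e.g. the point closest
# to the top-left corner (0, 1), as in the last bullet. Thresholds should
# be chosen on held-out validation data, as done here.
fpr, tpr, thresholds = roc_curve(y_val, proba)
ix = np.argmin(fpr**2 + (1 - tpr)**2)
y_pred_roc = (proba >= thresholds[ix]).astype(int)
```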