
I have a classification problem (A or B or C). I am currently evaluating test set results from the trained random forest, neural net, and logistic regression models.

Any one model works pretty well on the test set when I require an A or B call to exceed some probability threshold, for instance 80%. Otherwise the call is C.

I was a little surprised that averaging and majority vote across these three models didn't improve performance over the neural net alone.

The only thing that did help, and substantially, was letting any one model be sufficient for an A or B classification.
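To make the comparison concrete, here is a minimal sketch of the combination rules described above: the single-model threshold call, averaging, majority vote, and the "any one model is sufficient" rule. The function names, the example probabilities, and the tie-breaking choices (no majority falls back to C; conflicting A and B triggers fall back to C) are assumptions for illustration, not details from my actual setup.

```python
from collections import Counter

THRESHOLD = 0.80  # the 80% cutoff used as an example in the question


def call_single(probs, threshold=THRESHOLD):
    """Call A or B only if its probability exceeds the threshold, else C."""
    best = max(("A", "B"), key=lambda label: probs[label])
    return best if probs[best] > threshold else "C"


def call_average(model_probs, threshold=THRESHOLD):
    """Average the class probabilities across models, then threshold."""
    n = len(model_probs)
    mean = {k: sum(p[k] for p in model_probs) / n for k in ("A", "B", "C")}
    return call_single(mean, threshold)


def call_majority(model_probs, threshold=THRESHOLD):
    """Threshold each model, then take the label with a strict majority."""
    votes = Counter(call_single(p, threshold) for p in model_probs)
    label, count = votes.most_common(1)[0]
    # Assumption: with no strict majority, fall back to C.
    return label if count > len(model_probs) / 2 else "C"


def call_any(model_probs, threshold=THRESHOLD):
    """One confident model is enough to trigger an A or B call."""
    triggers = {call_single(p, threshold) for p in model_probs} - {"C"}
    # Assumption: if models trigger both A and B, treat it as a conflict and return C.
    return triggers.pop() if len(triggers) == 1 else "C"


# Hypothetical predicted probabilities for one test sample.
rf = {"A": 0.85, "B": 0.05, "C": 0.10}
nn = {"A": 0.60, "B": 0.10, "C": 0.30}
lr = {"A": 0.55, "B": 0.15, "C": 0.30}
models = [rf, nn, lr]

print(call_average(models))   # C (mean P(A) is about 0.67, below 0.80)
print(call_majority(models))  # C (only one of three models calls A)
print(call_any(models))       # A (the random forest alone exceeds 0.80)
```

In this made-up sample the "any one" rule is the only one that calls A. Conflicts aside, that rule makes an A or B call whenever at least one model does, so it can only add A/B calls relative to a single model, which may pick up more true A/B cases at the risk of more false ones.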

While it appears to work, I'm unsure about it, inasmuch as everything I read tends to suggest majority vote or averaging, and this "one trigger is sufficient" rule never seems to be mentioned.

Interested in people's thoughts on this before I start working on stacking methods, perhaps unnecessarily. Thank you!

steve d
Maybe the conclusions of Dormann, Carsten F., et al. "Model averaging in ecology: a review of Bayesian, information‐theoretic, and tactical approaches for predictive inference." Ecological Monographs 88.4 (2018): 485-504 are of interest in this context: https://esajournals.onlinelibrary.wiley.com/doi/full/10.1002/ecm.1309 – Florian Hartig Mar 14 '19 at 16:38

0 Answers