
Question

I read the post that describes the difference between classification and prediction. The main takeaway is that we sometimes prefer algorithms that output probabilistic estimates rather than deterministic 0/1 decisions, because this lets us choose a decision threshold best suited to the downstream task.

However, almost none of the machine learning courses I have taken address this point well. This leaves me confused about two things:

  • Why does the machine learning community not seem to care about this?
  • Is there a list categorizing algorithms that naturally output probability estimates? To my knowledge, logistic regression, boosting, and decision trees do this while SVM does not, but I am looking for a more complete list.
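For concreteness, here is a minimal sketch of the distinction I mean (using scikit-learn; the synthetic data and the 0.3 threshold are purely illustrative): the model estimates probabilities, and the 0/1 decision only appears once a threshold is applied.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Illustrative synthetic data
X, y = make_classification(n_samples=200, random_state=0)
clf = LogisticRegression().fit(X, y)

proba = clf.predict_proba(X)[:, 1]     # probability estimates P(y=1 | x)
default = clf.predict(X)               # hard labels: thresholds proba at 0.5
custom = (proba >= 0.3).astype(int)    # a different downstream threshold

# predict() is just predict_proba() plus the default 0.5 cutoff
assert ((proba >= 0.5).astype(int) == default).all()
```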
Mr.Robot
  • I think you have a misunderstanding about how the methods you mention work. For example, in logistic regression, the output is a predicted class probability $p$, not a $0$ or $1$. Only after setting a boundary for the decision can you turn those estimates into zeroes and ones. See [here](https://stats.stackexchange.com/a/208872/176202) for example. Taking another of your mentioned methods: Decision trees (can) have class probabilities in their terminal nodes. – Frans Rodenburg Jan 16 '20 at 20:32
    My own opinion: Machine learning and statistics are not distinct fields of research. A good machine learning course should address the same fundamentals as a biostatistics course. Those who feel the distinction *is* necessary often say machine learning is more of an algorithmic approach whereas statistics is a mathematical one. This might explain why in practice, (poor) statistical courses focus too little on state-of-the-art algorithms, whereas (poor) machine learning courses pay too little attention to the underlying theory. – Frans Rodenburg Jan 16 '20 at 20:40
    "Why machine learning community does not seem to care about this" I agree that this point is often not explained well, but it's not true that the machine learning community doesn't care about this. What you're describing is a basic idea that is described clearly in popular textbooks, and which pretty much everyone who is well versed in machine learning understands. Maybe you could give an example of a book that you think doesn't explain this well enough. By the way, you can add neural networks to your list of models that output probability estimates. (Logistic regression is a special case.) – littleO Jan 16 '20 at 20:40
    @littleO I read a lot of papers that use logistic regression as a baseline (and these papers appear at good venues like KDD and NeurIPS). However, almost all of them use 0.5 as a default threshold. This gives me the feeling that they do not care about choosing a good threshold. – Mr.Robot Jan 16 '20 at 20:45
  • @FransRodenburg Yes. That is why taking machine learning courses from CS and statistics department gives me very different feelings. – Mr.Robot Jan 16 '20 at 20:49
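To make the threshold-selection point from the comments concrete, a hedged sketch (the cost values and the grid search are illustrative assumptions, not taken from any cited paper): instead of defaulting to 0.5, one can pick the threshold that minimizes total misclassification cost on held-out probability estimates.

```python
import numpy as np

def best_threshold(p, y, cost_fp=1.0, cost_fn=5.0):
    """Return the threshold minimizing total misclassification cost.

    p: predicted P(y=1 | x); y: true 0/1 labels.
    cost_fp / cost_fn are illustrative assumptions; with known costs,
    decision theory gives the optimum directly as cost_fp / (cost_fp + cost_fn),
    but a grid search makes the idea explicit.
    """
    thresholds = np.linspace(0.0, 1.0, 101)
    costs = [cost_fp * ((p >= t) & (y == 0)).sum()
             + cost_fn * ((p < t) & (y == 1)).sum()
             for t in thresholds]
    return thresholds[int(np.argmin(costs))]
```

With asymmetric costs (false negatives five times as costly as false positives here), the chosen threshold lands well below 0.5, which is exactly the kind of downstream adjustment a default cutoff hides.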

0 Answers