Start with the thread The Two Cultures: statistics vs. machine learning? Machine learning is about finding patterns or correlations in data. Causal inference, like statistics, is about inference. As others have already noted in the comments, those are different problems. When your aim is to study whether smoking causes cancer, you are not after classification accuracy but after confirming or rejecting the existence of a causal relationship. On the other hand, if you want to accurately predict that someone will get cancer, you may throw in many different variables that relate to cancer directly or indirectly in order to maximize predictive performance. Sure, causal reasoning could and should inform your decisions about which features to consider, but you wouldn't base the prediction solely on the fact that somebody smokes just because that relationship is causal. Keep in mind that there can be multiple causal relationships (many things cause cancer), and we may not be able to measure all of them. A causal relationship also doesn't mean that the outcome is certain (not everyone who smokes gets cancer), the relationships can be of varying strength, and the data will be noisy. So having features that are causally related to the outcome does not give you any guarantee about classification performance.
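To make that last point concrete, here is a minimal simulation sketch in Python. Everything in it is an assumption made for illustration: the probabilities are invented, `marker` is a hypothetical variable that is a consequence of cancer rather than a cause, and the "classifier" is just a majority-vote lookup table, not a recommendation for a real model.

```python
import random
from collections import Counter, defaultdict


def simulate(n, seed=0):
    """Draw n people from a toy model: smokes -> cancer -> marker."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        smokes = rng.random() < 0.30
        # Smoking raises the risk but does not make cancer certain (assumed rates).
        cancer = rng.random() < (0.30 if smokes else 0.05)
        # Hypothetical marker: an effect of cancer, not a cause of it.
        marker = rng.random() < (0.95 if cancer else 0.05)
        rows.append({"smokes": smokes, "marker": marker, "cancer": cancer})
    return rows


def fit_majority(rows, features):
    """For every combination of feature values, remember the majority label."""
    counts = defaultdict(Counter)
    for r in rows:
        counts[tuple(r[f] for f in features)][r["cancer"]] += 1
    return {key: c.most_common(1)[0][0] for key, c in counts.items()}


def accuracy(table, rows, features):
    """Share of rows whose label matches the table's prediction."""
    hits = sum(
        table.get(tuple(r[f] for f in features), False) == r["cancer"]
        for r in rows
    )
    return hits / len(rows)


train = simulate(20000, seed=0)
test = simulate(20000, seed=1)

for features in (["smokes"], ["marker"], ["smokes", "marker"]):
    table = fit_majority(train, features)
    print(features, round(accuracy(table, test, features), 3))
```

Under these assumed rates, the true cause on its own never pushes the risk above 50%, so the cause-only classifier ends up predicting "no cancer" for everyone and scores around the base rate (about 0.875). The non-causal marker scores about 0.95. The causal variable is real and important, yet it is the weaker predictor here.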
To answer your question with a metaphor: using a causal model for classification is like being asked to transport goods from A to B and starting by designing your own lorry. Sure, the custom lorry may be faster and more efficient for your problem, but is it worth it? The same goes for causal inference. Using it would mean spending at least twice as much time on the problem, since you would be solving two problems: (a) causal inference and (b) classification. It is also not obvious how you would inject the causal knowledge into the classification model, so that might become a third research problem to solve. If you are working on a high-stakes problem (e.g. medicine) and have a budget that allows for the research, sure, it might be worth it. But in most cases, if your aim is only classification, it is enough to have a machine learning model that finds by itself an approximate representation of the data that is good enough for classification.