Recently, I have been studying causal inference and have come to a bit of a crossroads with respect to making decisions based on the analysis of data (especially in a business/industry setting). Specifically, I am referring to common problems like "churn modelling", segmentation, and lifetime value problems where the goal is to figure out specific demographics to "target" to increase revenue or to decrease churn, etc.
Often, I see these problems solved in the following way (whether good or bad). Take a bunch of predictors that are plausibly associated in some way with the outcome variable (whether that is churn, lifetime value, or some other profitability metric) and then fit a machine learning model to the problem (using the standard train/test splitting, etc.). Then, look at the feature importances of the best predictive model (perhaps using a method meant to handle multicollinearity, like SHAP values) and determine the most impactful features, which we interpret as the most predictive variables. We can then make decisions on who to target, market to, etc. based on these influential variables.
Now, we know that none of this is causal in any way, since we are just exploiting correlations. We didn't consider the actual causal structure of the problem, draw out DAGs as Pearl suggests, or condition on sufficient adjustment sets to derive causal effects (and ultimately see the impact of "treatments"). Through careful handling of causality, we can deal with issues that may arise from the above approach, such as Simpson's paradox.
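To make the Simpson's paradox concern concrete, here is a minimal sketch in plain Python. The counts, the segment names, and the "retention offer" treatment are all made up for illustration, and I am assuming the customer segment is the only confounder, so that adjusting for it (standardizing over segments) is a valid adjustment set. It compares the naive association with the adjusted estimate.

```python
# Hypothetical counts, invented purely for illustration:
# (segment, got_offer) -> (number of customers, number who churned)
counts = {
    ("low_risk", True): (100, 5),
    ("low_risk", False): (900, 90),
    ("high_risk", True): (900, 270),
    ("high_risk", False): (100, 40),
}
segments = sorted({segment for segment, _ in counts})
total_customers = sum(n for n, _ in counts.values())


def naive_churn_rate(got_offer):
    """Churn rate pooled over segments: a purely associational quantity."""
    n = sum(counts[(s, got_offer)][0] for s in segments)
    churned = sum(counts[(s, got_offer)][1] for s in segments)
    return churned / n


def adjusted_churn_rate(got_offer):
    """Standardized churn rate: sum over segments of
    P(segment) * P(churn | segment, got_offer).
    Only causal under the assumption that segment is the sole confounder."""
    rate = 0.0
    for s in segments:
        segment_size = counts[(s, True)][0] + counts[(s, False)][0]
        n, churned = counts[(s, got_offer)]
        rate += (segment_size / total_customers) * (churned / n)
    return rate


# Difference in churn rate: offer minus no offer.
naive_diff = naive_churn_rate(True) - naive_churn_rate(False)
adjusted_diff = adjusted_churn_rate(True) - adjusted_churn_rate(False)

print(f"naive difference:    {naive_diff:+.3f}")
print(f"adjusted difference: {adjusted_diff:+.3f}")
```

With these made-up counts, the naive difference is positive (the offer looks like it increases churn, because it was mostly given to high-risk customers), while the adjusted difference is negative (the offer lowers churn within every segment). A purely predictive model would happily pick up the first pattern.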
My question is as follows: is the first method of modelling, and ultimately the business decisions made from it, incorrect or dangerous? Equivalently, is a fully causal analysis needed in this setting? I can see why this may be the case, but with, say, a huge dataset, many predictors, and proper regularization, I have a tough time believing that the ML approach would lead to outright bad decisions (though the conclusions would perhaps not be as strong). In addition, I think many would agree that the first method is less time-consuming. Writing out a causal model is difficult, especially when domain expertise is lacking.