It seems to me that despite the huge variety and rapid development of ML methods (I'm specifically interested in regression methods), OLS is still considered, and often cited, as a benchmark - which makes sense, if only because it is so easy to interpret.
Despite this, I have failed to find regression methods that can be seen as generalizing multivariate OLS, in the sense of being able to identify linear relations between the features and the target. I find this surprising, because in many real-world settings (I work with economic data) linear relations coexist with nonlinearities, and because... in principle, it does not seem like a difficult task.
For instance, the value predicted by a regression tree for an observation depends only on the leaf it ends up in. Why not fit a linear regression on each leaf instead, and use the resulting (leaf-specific) coefficients for the prediction? Yes, the cost of the per-leaf regressions would (if I understand correctly) dominate the cost of building the tree... but on the other hand, it would still be asymptotically equivalent to the cost of a single OLS fit (with a lower memory footprint). And as long as the tree is not too deep (that is, each leaf still contains enough observations), this would, for instance, recover a purely linear relation exactly. It would be a sort of automatic, multi-dimensional piecewise linear regression.
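To make this concrete, here is a minimal sketch of what I have in mind, using sklearn's DecisionTreeRegressor and LinearRegression (the function names `fit_leaf_ols` / `predict_leaf_ols` and the hyperparameter values are just illustrative, not from any existing library):

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor
from sklearn.linear_model import LinearRegression

def fit_leaf_ols(X, y, max_depth=3, min_samples_leaf=50):
    """Fit a regression tree, then an OLS model on the training rows of each leaf."""
    tree = DecisionTreeRegressor(max_depth=max_depth,
                                 min_samples_leaf=min_samples_leaf).fit(X, y)
    leaf_ids = tree.apply(X)  # index of the leaf each training row falls into
    leaf_models = {
        leaf: LinearRegression().fit(X[leaf_ids == leaf], y[leaf_ids == leaf])
        for leaf in np.unique(leaf_ids)
    }
    return tree, leaf_models

def predict_leaf_ols(tree, leaf_models, X):
    """Route each row to its leaf and predict with that leaf's OLS coefficients."""
    leaf_ids = tree.apply(X)
    y_hat = np.empty(X.shape[0])
    for leaf, model in leaf_models.items():
        mask = leaf_ids == leaf
        if mask.any():
            y_hat[mask] = model.predict(X[mask])
    return y_hat

# On a purely linear target every leaf recovers (roughly) the same coefficients,
# so the piecewise-linear prediction essentially coincides with plain OLS.
rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=2000)
tree, leaf_models = fit_leaf_ols(X, y)
print(predict_leaf_ols(tree, leaf_models, X[:5]))
```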
Or, vice versa, it would be tempting to run a regression tree on the OLS residuals. If the tree is grown carefully (for example by using sklearn's min_impurity_decrease argument to block splits that merely fit noise), this would, it seems to me, reliably dominate OLS in terms of explanatory power, at least for large samples (with the two coinciding in the case of a perfectly linear relation).
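Again only as a rough sketch of this second idea (the data-generating process and the min_impurity_decrease value below are made up for illustration and would of course need tuning):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 2))
# A linear part plus a step nonlinearity that the tree should pick up.
y = 2.0 * X[:, 0] + np.where(X[:, 1] > 0, 1.0, -1.0) + rng.normal(scale=0.2, size=5000)

# Stage 1: plain OLS.
ols = LinearRegression().fit(X, y)
residuals = y - ols.predict(X)

# Stage 2: a tree on the residuals. With a suitable min_impurity_decrease the
# tree only splits where the residuals still carry structure; on a perfectly
# linear target it would stay a single (zero-mean) leaf, so the combined model
# would reduce to plain OLS.
tree = DecisionTreeRegressor(min_impurity_decrease=1e-2,
                             min_samples_leaf=100).fit(X, residuals)

y_hat = ols.predict(X) + tree.predict(X)
```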
Is there a reason why techniques of this kind are not widespread? Or are they?