I'm looking for references on zero-imputation with dummy-variable augmentation in the context of predictive models and MNAR (missing not at random) missingness. The idea is to impute zero for every missing value and to add one indicator column to the design matrix for each variable that had values imputed this way. The rationale is that the indicator picks up the average effect of the missingness mechanism, while the zero-imputed value itself transmits no signal.
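To make the construction concrete, here is a minimal sketch in plain Python (no dependencies). The function name `zero_impute_with_indicators` and the toy data are only illustrative and do not come from any library. I'm assuming that missing entries are encoded as `None` or NaN, and that an indicator is added only for columns that actually contain a missing value.

```python
import math


def _is_missing(value):
    """Treat None and float NaN as missing (an assumption of this sketch)."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def zero_impute_with_indicators(rows):
    """Return (augmented_rows, flagged_columns).

    Each missing entry is replaced by 0.0, and one 0/1 indicator column is
    appended for every original column that has at least one missing entry.
    """
    if not rows:
        return [], []
    n_cols = len(rows[0])
    # Columns that need a missingness indicator.
    flagged = [j for j in range(n_cols)
               if any(_is_missing(row[j]) for row in rows)]
    augmented = []
    for row in rows:
        filled = [0.0 if _is_missing(v) else float(v) for v in row]
        indicators = [1.0 if _is_missing(row[j]) else 0.0 for j in flagged]
        augmented.append(filled + indicators)
    return augmented, flagged


# Toy design matrix: column 1 and column 2 each have one missing entry.
X = [
    [1.5, None, 3.0],
    [2.0, 4.0, float("nan")],
    [0.5, 1.0, 2.0],
]
X_aug, flagged = zero_impute_with_indicators(X)
for row in X_aug:
    print(row)
```

In a linear model, the coefficient on each appended indicator then acts as an intercept shift for the rows where that variable was missing, which is the "average effect" I refer to above.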
I'm curious how this works in tree-based methods (I imagine that it doesn't?), in penalized regression, and in neural nets. If it works, the method has the obvious appeal of being automatic and low-cost for algorithms that are robust to large numbers of variables.
I'm aware that this approach yields biased coefficient estimates in the context of inferential statistics.