I am new to the fascinating world of matching and propensity scoring. I will very likely be using one or more matching methods in my forthcoming project, probably with the R package MatchIt.
I am particularly interested in finding out whether there is a method that makes it possible to interpret the regression coefficients of the covariates after matching. Concretely, say you have an outcome variable Y, a treatment variable A and two covariates X1 and X2. You first match using a model that looks something like A ~ X1 + X2 (I'm using R formula notation here) and then use the matched dataset to fit a regression like Y ~ A + X1 + X2. I would like to be able to interpret the coefficients of X1 and X2 in this model.
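To make the workflow concrete, here is a minimal sketch of what I have in mind. The simulated data, the effect sizes and the choice of 1:1 nearest-neighbour matching are my own assumptions for illustration only; the matchit() and match.data() calls are from MatchIt.

```r
library(MatchIt)

# Hypothetical simulated data: two covariates that affect both
# treatment assignment and the outcome (all values are assumptions).
set.seed(1)
n  <- 500
X1 <- rnorm(n)
X2 <- rnorm(n)
A  <- rbinom(n, 1, plogis(-1 + 0.5 * X1 + 0.5 * X2))
Y  <- 2 * A + X1 + X2 + rnorm(n)
dat <- data.frame(Y, A, X1, X2)

# Step 1: match treated and control units on the covariates
# (1:1 nearest-neighbour matching on the propensity score).
m.out <- matchit(A ~ X1 + X2, data = dat, method = "nearest")

# Step 2: extract the matched dataset, which includes a 'weights' column.
md <- match.data(m.out)

# Step 3: fit the outcome model on the matched data.
fit <- lm(Y ~ A + X1 + X2, data = md, weights = weights)
coef(fit)
```

My question is about the X1 and X2 entries of coef(fit), not the A entry.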
The relevant peer-reviewed work (Ho et al., 2007, 2011) and vignettes (Greifer, 2022) have confused me a bit. The school of thought represented by Ho et al. seems to be that matching can be used to preprocess the data before fitting some model with Y as the outcome and A, X1 and X2 as the independent variables (like the model above). To me, this is like saying that, after matching the data, I can use any model, e.g. linear regression, GLM, GAM, Random Forest etc., to interpret the relationship of Y with X1 and X2 through e.g. regression coefficients, partial plots etc. (See also "How exactly to evaluate Treatment effect after Matching?" and "Regression after matching".)
On the other hand, Greifer (2022) states, in many places, that one should not interpret the covariate coefficients:
It is important not to interpret the coefficients and tests of the other covariates in the outcome model. These are not causal effects and their estimates may be severely confounded. Only the treatment effect estimate can be interpreted as causal assuming the relevant assumptions about unconfoundedness are met. Inappropriately interpreting the coefficients of covariates in the outcome model is known as the Table 2 fallacy [...]
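To check my understanding of that quote, I tried a small simulation. This is only my own sketch, not something from the vignette: the data-generating process and all names are assumptions. Here X1 affects Y both directly (coefficient 1) and indirectly through A, so I would expect its coefficient in the outcome model to capture only the part that does not go through A.

```r
# Hypothetical data-generating process (all values are assumptions):
# X1 and X2 influence treatment A, and A, X1 and X2 all influence Y.
set.seed(42)
n  <- 5000
X1 <- rnorm(n)
X2 <- rnorm(n)
A  <- rbinom(n, 1, plogis(1.5 * X1 + 0.5 * X2))
Y  <- 2 * A + 1 * X1 + 1 * X2 + rnorm(n)

# Outcome model that adjusts for treatment: the X1 coefficient
# reflects only the direct path X1 -> Y, holding A fixed.
adj <- lm(Y ~ A + X1 + X2)

# Model without treatment: the X1 coefficient also absorbs
# the indirect path X1 -> A -> Y.
unadj <- lm(Y ~ X1 + X2)

# Compare the two X1 coefficients side by side.
c(adjusted = unname(coef(adj)["X1"]), unadjusted = unname(coef(unadj)["X1"]))
```

If I read this correctly, the two X1 coefficients answer different questions, and neither is automatically "the" causal effect of X1, but I am not sure this is the whole story behind the warning.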
So it is not clear to me if I can or cannot interpret the coefficients of the covariates following matching. I am probably missing something, so some clarity would be highly appreciated!
References

Greifer N (2022). Estimating Effects After Matching. https://kosukeimai.github.io/MatchIt/articles/estimating-effects.html#after-pair-matching-without-replacement [accessed 11 Jan 2022]

Ho DE, Imai K, King G & Stuart EA (2007). Matching as Nonparametric Preprocessing for Reducing Model Dependence in Parametric Causal Inference. Political Analysis 15:199–236. doi:10.1093/pan/mpl013

Ho DE, Imai K, King G & Stuart EA (2011). MatchIt: Nonparametric Preprocessing for Parametric Causal Inference. Journal of Statistical Software 42(8).