Model Selection in Propensity Score Matching

Question

I am trying to fit a logistic model to create propensity scores. Looking through the literature, there appears to be some disagreement on which covariates to include when designing such a model. Some say that all covariates that affect both treatment assignment and outcome should be included. Others advocate including only variables that predict treatment assignment, and so on.

When choosing a model, what is our primary goal? Are we most interested in predicting assignment to a treatment group? Or are we most interested in balancing our sample on all covariates?

If we are most interested in prediction, I would think that it might be preferable to go through some model fitting process and to be mindful of over-fitting. However, I often see an emphasis on including all covariates as opposed to creating a model with better out-of-sample prediction.

[This](http://stats.stackexchange.com/questions/1194/practical-thoughts-on-explanatory-vs-predictive-modeling) is a relevant thread about explanatory versus predictive modelling. — Richard Hardy, Nov 03 '15 at 18:59

StatsStudent · Accepted Answer · 2015-11-03T19:22:38.973

Ideally, you'd want to include as many covariates as possible in your propensity score (PS) model in order to predict the probability of treatment assignment. The problem with that approach is that some of those covariates may themselves be influenced by the treatment, and this violates the ignorability assumption of propensity scores: that treatment assignment and response are conditionally independent given the covariates.
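For reference, the assumption can be written compactly. The notation here is an assumption for illustration and is not taken from the post: $T$ is the treatment indicator, $X$ the covariates, $Y(0)$ and $Y(1)$ the potential outcomes, and $e(X)$ the propensity score.

```latex
e(X) = \Pr(T = 1 \mid X), \qquad
\bigl(Y(0),\, Y(1)\bigr) \;\perp\!\!\!\perp\; T \mid X
```

A covariate that is itself affected by $T$ is a post-treatment variable, and conditioning on it can break the independence statement on the right.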

In addition, you risk including covariates that are not predictive of the treatment at all, in which case you will not reduce selection bias, but you will be adding noise, increasing the variance of the estimated treatment effect.

You do not want to focus on the fit of the model or on the significance of its parameters, but on whether the resulting model succeeds in achieving covariate balance between the treatment groups.
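One common way to check balance is the standardized mean difference (SMD) of each covariate between treated and control units, computed before and after matching. The following is a minimal sketch using only the Python standard library. The data, the `age` covariate, and the hand-picked "matched" sample are all hypothetical, and the helper names are made up for illustration rather than taken from any package.

```python
from math import sqrt
from statistics import mean, variance


def standardized_mean_difference(treated, control):
    """Absolute difference in means divided by the pooled standard deviation."""
    pooled_sd = sqrt((variance(treated) + variance(control)) / 2)
    if pooled_sd == 0:
        return 0.0
    return abs(mean(treated) - mean(control)) / pooled_sd


def balance_table(rows, covariates, treatment_key="treated"):
    """Return {covariate: SMD} for a list of dict rows."""
    treated = [r for r in rows if r[treatment_key] == 1]
    control = [r for r in rows if r[treatment_key] == 0]
    return {
        name: standardized_mean_difference(
            [r[name] for r in treated], [r[name] for r in control]
        )
        for name in covariates
    }


# Hypothetical full sample: controls are younger on average.
full_sample = [
    {"treated": 1, "age": 40},
    {"treated": 1, "age": 42},
    {"treated": 1, "age": 44},
    {"treated": 0, "age": 25},
    {"treated": 0, "age": 30},
    {"treated": 0, "age": 35},
    {"treated": 0, "age": 40},
    {"treated": 0, "age": 42},
    {"treated": 0, "age": 44},
]

# Hypothetical matched sample: each treated unit keeps its closest control.
matched_sample = [r for r in full_sample if r["treated"] == 1 or r["age"] >= 40]

before = balance_table(full_sample, ["age"])
after = balance_table(matched_sample, ["age"])
print("SMD before matching:", before)
print("SMD after matching:", after)
```

A commonly cited rule of thumb treats an absolute SMD below about 0.1 as adequate balance, but the threshold is a convention, not a test.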

If you do plan to use propensity scores, I'd suggest you look into gradient boosting methods for obtaining the probability of treatment assignment given the covariates. Research on this method suggests it typically performs better than simple logistic regression models.

Lastly, if you are using R, I'd recommend taking a look at the twang package. It is an excellent package for obtaining propensity scores and, more importantly, for assessing how well they achieve covariate balance. For small to medium-sized datasets, I think it is the best software around, and I've spent a lot of time working with similar packages in SAS, Stata, and R.

I'm not sure how to explain this theoretically, but empirically I've seen data from randomized experiments (LaLonde) wherein a covariate that is significantly different between control and treatment before matching actually reduces overall balance after matching when included in a propensity score equation. It can't be influence from the (randomized) assignment, but perhaps something to do with its relationship with other balancing covariates? — Hack-R, Jul 01 '16 at 13:05

score 1 · Answer 2 · answered Nov 03 '15 at 19:46

The task of choosing variables for a propensity score is no different from selecting variables to adjust for confounding. Causally, a candidate variable must be predictive of treatment and a cause of the outcome.

If a variable is predictive of treatment but independent of the outcome, then adding it to a propensity score model will affect the score predictions, but will do very little to affect the estimated relationship between the exposure and outcome. Rather, the risks of overfitted models are more likely to creep into play and affect the quality of your propensity score.
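One way to see the cost of such a variable, at least when the scores are used for weighting, is through the effective sample size (ESS). A strong predictor of treatment pushes the estimated scores toward 0 or 1, which produces a few very large weights. The sketch below uses Kish's approximation with made-up scores. The function names and numbers are hypothetical, and it illustrates the weighting case only, not matching.

```python
def ipw_weights(treatment, scores):
    """Inverse-probability weights: 1/e for treated units, 1/(1-e) for controls."""
    return [1 / e if t == 1 else 1 / (1 - e) for t, e in zip(treatment, scores)]


def effective_sample_size(weights):
    """Kish's approximation: (sum of weights)^2 / (sum of squared weights)."""
    total = sum(weights)
    return total * total / sum(w * w for w in weights)


treatment = [1, 1, 0, 0]

# Hypothetical scores from a model without the strong treatment predictor.
moderate_scores = [0.5, 0.5, 0.5, 0.5]
# Hypothetical scores after adding it: predictions move toward 0 and 1.
extreme_scores = [0.9, 0.1, 0.9, 0.1]

ess_moderate = effective_sample_size(ipw_weights(treatment, moderate_scores))
ess_extreme = effective_sample_size(ipw_weights(treatment, extreme_scores))
print(ess_moderate, ess_extreme)
```

With these made-up numbers the ESS drops from 4 to about 2.4, even though the number of units is unchanged. A lower ESS generally means a noisier effect estimate.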

A more serious concern is a variable that has a reverse causal relationship with the outcome. This is more common in longitudinal studies, where marginal structural models should be used. For instance, if a patient is assigned antihypertensive meds because they had diagnosed hypertension, the hypertension may have been a function of their left ventricular mass. If the latter is the outcome of interest, you have introduced reverse causality into the model.

So I think it is fair to say that the principles for selecting propensity score factors for matching / weighting / adjustment are no different from those for selecting confounders.
