IPTW propensity scores for multiple actions

Question

I set up a small project using a random forest to estimate the treatment effect between a control group and an action group, and used IPTW with propensity scores to make the two groups comparable. Using a random forest has the advantage that it gives us an estimated treatment effect per observation that can vary with the covariates (in contrast to a linear model, where this would mean adding a lot of interaction terms).

Now I would like to extend this to three possible actions and one control group, so that I can estimate the best action to take for new observations.

However, as far as I know, propensity scores estimate the probability that an observation was part of the action group, so we can make the covariates of the action and control group similar. How do I extend this to a situation where I want the four groups to have similar covariates so we can safely compare the treatment effect of the three action groups?

EDIT for the bounty: After reading some of the papers in this field I want to rephrase my goal: I would like to estimate the treatment effect for a categorical treatment variable. Much of the literature uses a binary treatment variable. The comment by Dimitriy V. Masterov points me in the right direction. Still, I would like a more accessible, layman's explanation of how to do weighting based on propensity scores when there are multiple actions.

Take a look at the manual for Stata's [MVT IPW command](https://www.stata.com/manuals/teteffectsmultivalued.pdf). It has a nice discussion with references to the relevant literature. — dimitriy, Feb 28 '18 at 18:34
What is your query of interest, the causal effect of a joint intervention of treatments, say $X_1 = 1$, $X_2 = 1$ and $X_3 = 1$ versus $X_1 = 0$, $X_2 = 0$ and $X_3 = 0$? — Carlos Cinelli, Mar 01 '18 at 06:34
Or do you want to estimate each effect separately, and compare which one is the most effective? — Carlos Cinelli, Mar 01 '18 at 06:36
@CarlosCinelli Query of interest is $X_1 = 1, X_2 = 0, X_3 = 0$ versus $X_1 = 0, X_2 = 1, X_3 = 0$ versus $X_1 = 0, X_2 = 0, X_3 = 1$. Or in simple terms: out of the three known actions, what is the most effective action for this person? — Peter Smit, Mar 01 '18 at 08:47
I found this question essentially asking the same: https://stats.stackexchange.com/questions/202698/comparing-two-or-more-treatments-with-inverse-probablity-of-treatment-weighting — Peter Smit, Mar 05 '18 at 14:27
This tutorial seems to do what I'm interested in. Compute a weight per observation so that all covariates are balanced over multiple treatment options: https://rpubs.com/kaz_yos/matching-weights — Peter Smit, Mar 05 '18 at 15:22

Carlos Cinelli · Accepted Answer · 2018-03-11T03:38:01.663

At the theoretical/identification level, there's no difference in IPTW for binary, multi-valued, or even continuous treatments. IPTW is simply an estimation procedure that follows directly from the backdoor causal estimand. That is, let $Y$ be your outcome, $X$ your treatment, and $C$ the set of confounders satisfying the backdoor criterion. The estimand for the causal effect of $X$ on $Y$ is then:

$$ P(y|do(x)) = \sum_{c}P(y|x, c)P(c) $$

By the definition of conditional probability, $P(y|x, c) = \frac{P(y, x, c)}{p(x|c)P(c)}$, so we can rewrite it as:

$$ P(y|do(x)) = \sum_{c}\frac{P(y, x, c)}{p(x|c)P(c)}P(c) = \sum_{c}\frac{P(y, x, c)}{p(x|c)} $$

Thus, you can see that we need to weight the joint probability $P(y, x, c)$ by the inverse of the probability of the treatment actually received, $\frac{1}{p(x|c)}$. If $X$ is multi-valued but discrete, $p(x|c)$ is still a probability mass function, so you simply estimate a model for $p(x|c)$ and predict this probability for the treatment actually received for all units; the inverse of this will be your weights. If $X$ is continuous, the logic is the same, but now $p(x|c)$ is a density.
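
As a sketch of what this weighting looks like with data (the indicator $\mathbb{1}\{\cdot\}$, the sample size $n$, and the fitted model $\hat{p}$ are notation introduced here for illustration, not part of the derivation above): multiplying the weighted expression by $y$ and summing over $y$ gives the mean outcome under treatment level $x$,

$$ E[Y|do(x)] = \sum_{y}\sum_{c} y\,\frac{P(y, x, c)}{p(x|c)} = E\left[\frac{\mathbb{1}\{X = x\}\,Y}{p(x|C)}\right] $$

which suggests the sample analogue

$$ \hat{E}[Y|do(x)] = \frac{1}{n}\sum_{i=1}^{n}\frac{\mathbb{1}\{x_i = x\}\,y_i}{\hat{p}(x_i|c_i)} $$

Only units that actually received $x$ contribute, each scaled up by the inverse of its estimated probability of receiving it. In practice the sum is often divided by $\sum_{i}\mathbb{1}\{x_i = x\}/\hat{p}(x_i|c_i)$ instead of $n$, which tends to be more stable when some weights are large.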

To sum up, you need to: (i) posit a model for $p(x|c)$, in your case a model that accommodates categorical variables; (ii) compute $w_i = 1/\hat{p}(x_i|c_i)$ for each observation, where $\hat{p}(x_i|c_i)$ is the predicted probability of the treatment actually received for that unit, given its covariates; (iii) compute treatment effects (contrasts of the outcome between categories of $x$, say $x_1$ vs $x_3$) using the weighted sample.
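
The three steps can be sketched in Python roughly as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: the data are simulated (one control arm coded 0 and three actions coded 1 to 3, with made-up coefficients), the model for $p(x|c)$ is assumed to be a multinomial logistic regression from scikit-learn, and all variable names are hypothetical.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n, K = 20000, 4  # K arms: 0 = control, 1..3 = actions (simulated, illustrative)

# Two confounders that drive both treatment assignment and outcome.
C = rng.normal(size=(n, 2))

# Assumed assignment mechanism: softmax over linear scores (made-up coefficients).
B = np.array([[0.0, 0.5, -0.3, 0.2],
              [0.0, -0.2, 0.4, 0.3]])
scores = C @ B
# Gumbel-max trick: argmax of (scores + Gumbel noise) samples from the softmax.
x = np.argmax(scores + rng.gumbel(size=scores.shape), axis=1)

# Assumed outcome model: arm effect tau[x] plus confounding plus noise.
tau = np.array([0.0, 1.0, 2.0, 3.0])
y = tau[x] + 2.0 * C[:, 0] + C[:, 1] + rng.normal(size=n)

# (i) Posit a model for p(x | c); here a multinomial logistic regression
#     (scikit-learn's default L2 penalty is left on for simplicity).
model = LogisticRegression(max_iter=1000).fit(C, x)
p_hat = model.predict_proba(C)  # columns are ordered as model.classes_

# (ii) Weight = 1 / predicted probability of the treatment actually received.
col = np.searchsorted(model.classes_, x)  # column index of each unit's own arm
w = 1.0 / p_hat[np.arange(n), col]

# (iii) Weighted mean outcome per arm, then contrasts against control.
means = {k: np.average(y[x == k], weights=w[x == k]) for k in range(K)}
effects = {k: means[k] - means[0] for k in range(1, K)}

# Unweighted contrasts for comparison (these are confounded in this simulation).
naive = {k: y[x == k].mean() - y[x == 0].mean() for k in range(1, K)}

for k in range(1, K):
    print(f"action {k}: true={tau[k]:.2f}  naive={naive[k]:.2f}  iptw={effects[k]:.2f}")
```

The weighted contrasts here are averages over the whole sample. To get effects that vary with covariates, as in the random forest setup from the question, one option is to pass `w` as `sample_weight` when fitting the outcome model; whether that suits a particular forest implementation is something to check against its documentation.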
