I set up a small project that uses a random forest to estimate the treatment effect between a control group and an action group, and I used IPTW (inverse probability of treatment weighting) based on propensity scores to make the two groups comparable. A random forest has the advantage that it gives an estimated treatment effect per observation, which can vary with the covariates (in a linear model this would require adding a lot of interaction terms).
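For concreteness, the binary weighting step can be sketched as below. This is a minimal illustration on synthetic data, not the actual project code: the logistic regression propensity model, the helper name `binary_iptw_weights`, and the simulated covariates are all assumptions made for the example.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression


def binary_iptw_weights(X, t):
    """Hypothetical helper: IPTW weights for a binary treatment t in {0, 1}."""
    model = LogisticRegression(max_iter=1000).fit(X, t)
    e = model.predict_proba(X)[:, 1]  # estimated propensity score P(T=1 | X)
    # Treated units get 1/e, control units get 1/(1-e).
    return np.where(t == 1, 1.0 / e, 1.0 / (1.0 - e))


# Synthetic data (assumption): treatment is more likely when X[:, 0] is large.
rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 2))
t = rng.binomial(1, 1.0 / (1.0 + np.exp(-X[:, 0])))

w = binary_iptw_weights(X, t)

# Covariate balance check on the confounder X[:, 0], before and after weighting.
raw_diff = X[t == 1, 0].mean() - X[t == 0, 0].mean()
weighted_diff = (
    np.average(X[t == 1, 0], weights=w[t == 1])
    - np.average(X[t == 0, 0], weights=w[t == 0])
)
print(f"raw difference: {raw_diff:.3f}, weighted difference: {weighted_diff:.3f}")
```

After weighting, the difference in the mean of the confounding covariate between the two groups should shrink towards zero, which is what "making the groups comparable" means here.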
Now I would like to extend this to three possible actions and one control group, so that I can estimate which action is best to take for new observations.
However, as far as I know, a propensity score estimates the probability that an observation was part of the action group, which is what lets us make the covariates of the action and control groups similar. How do I extend this to a situation where I want all four groups to have similar covariates, so that we can safely compare the treatment effects of the three action groups?
EDIT for the bounty: After reading some of the papers in this field I want to rephrase my goal: I would like to estimate the treatment effect for a categorical treatment variable. Much of the literature uses a binary treatment variable. The comment by Dimitriy V. Masterov points me in the right direction. Still, I would like a more accessible, layman's explanation of how to do weighting based on propensity scores when there are multiple actions.
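To make the question concrete: one way the multi-treatment case is commonly described is to fit a multinomial model for P(T = k | X) over all four groups and weight each observation by the inverse of the estimated probability of the group it actually ended up in. The sketch below illustrates that idea on synthetic data. The multinomial logistic regression, the helper names `multi_iptw_weights` and `weighted_group_means`, and the simulated assignment mechanism are assumptions for illustration, not an established recipe from this thread.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression


def multi_iptw_weights(X, t):
    """Hypothetical helper: weight = 1 / P(T = received group | X)."""
    model = LogisticRegression(max_iter=1000).fit(X, t)
    probs = model.predict_proba(X)  # shape (n, K), columns follow model.classes_
    col = np.searchsorted(model.classes_, t)  # column index of each unit's own group
    p_received = probs[np.arange(len(t)), col]
    return 1.0 / p_received


def weighted_group_means(X, t, w):
    """Weighted covariate means per group, to check balance."""
    return {
        int(k): np.average(X[t == k], axis=0, weights=w[t == k])
        for k in np.unique(t)
    }


# Synthetic data (assumption): group 0 is control, groups 1-3 are actions,
# and assignment depends on the covariates.
rng = np.random.default_rng(0)
n = 4000
X = rng.normal(size=(n, 2))
logits = np.column_stack(
    [np.zeros(n), 0.8 * X[:, 0], -0.8 * X[:, 0], 0.8 * X[:, 1]]
)
p = np.exp(logits)
p /= p.sum(axis=1, keepdims=True)
t = np.array([rng.choice(4, p=row) for row in p])

w = multi_iptw_weights(X, t)

raw = weighted_group_means(X, t, np.ones(n))
balanced = weighted_group_means(X, t, w)
raw_spread = max(m[0] for m in raw.values()) - min(m[0] for m in raw.values())
bal_spread = max(m[0] for m in balanced.values()) - min(m[0] for m in balanced.values())
print(f"spread of group means of X[:, 0]: raw {raw_spread:.3f}, weighted {bal_spread:.3f}")
```

If this reading is right, each weighted group should resemble the full population in its covariates, so the four groups become comparable with each other rather than only pairwise with the control group.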