I frequently measure the effect of behavioral health treatment interventions on outcomes of interest. However, comparing the relative efficacy of different types of treatment is tricky: more intensive interventions tend to go to clients with more severe issues, whose outcomes are more likely to be negative regardless of treatment. RCTs are generally unethical in the areas I'm studying.
What are your favorite approaches for addressing this sort of selection bias, where level of need determines intervention type but also plays a role in determining the outcome? What are your critiques of common approaches?
A note on what I mean by covariates indicating severity: I have no magic variable that shows "this treatment is what a person needs". These covariates are based on theory and on observed/available data, and they are only likely indicators that have to be weighed alongside other factors. With that caveat, some of the approaches I've explored:
Multivariate models including covariates for severity of condition (e.g., primary diagnosis, history of emergency services, etc.);
Propensity-score matching, with the same factors predicting treatment type and outcome (though this only lets me examine one treatment type at a time);
Latent class analysis (built off covariates that may indicate severity);
Only running models on tightly-defined groups (e.g., only on people with one specific type of diagnosis).
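To make the problem concrete, here is a minimal simulated sketch of the bias described above and of the simplest version of the covariate-adjustment idea (comparing within severity groups, then averaging). Everything in it is an assumption made for illustration: the probabilities are invented, the names `simulate` and `rate` are hypothetical, and severity is treated as a single, perfectly observed binary flag, which is exactly what I do not have in real data.

```python
import random

def simulate(n=20000, seed=0):
    """Generate (severe, intensive, good_outcome) rows with made-up probabilities."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        severe = rng.random() < 0.5
        # Assumption: severe clients are much more likely to get intensive treatment.
        intensive = rng.random() < (0.8 if severe else 0.2)
        # Assumption: severity lowers the chance of a good outcome, and
        # intensive treatment truly raises it by 0.10 in both groups.
        p_good = (0.3 if severe else 0.7) + (0.10 if intensive else 0.0)
        good = rng.random() < p_good
        rows.append((severe, intensive, good))
    return rows

def rate(rows, intensive, severe=None):
    """Share of good outcomes for a treatment arm, optionally within one severity group."""
    sel = [g for s, t, g in rows
           if t == intensive and (severe is None or s == severe)]
    return sum(sel) / len(sel)

rows = simulate()

# Naive comparison: ignores severity, so it mixes the treatment effect
# with the fact that the intensive arm is mostly severe clients.
naive = rate(rows, True) - rate(rows, False)

# Stratified comparison: difference within each severity group,
# averaged using each group's share of the full sample.
adjusted = 0.0
for s in (False, True):
    share = sum(1 for r in rows if r[0] == s) / len(rows)
    adjusted += share * (rate(rows, True, s) - rate(rows, False, s))

print(f"naive difference:    {naive:+.3f}")
print(f"adjusted difference: {adjusted:+.3f}")
```

Under these made-up numbers the naive difference comes out negative (intensive treatment looks harmful), while the stratified difference lands near the built-in +0.10. The catch is that this only works because severity is fully observed here. With proxy covariates, whatever part of severity the proxies miss still leaks into the estimate.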