I would like to use a greedy nearest neighbour method to do propensity score matching. Though I've little experience here, it seems that the distance measure used is generally a propensity score generated from a logistic regression. My question is: why logistic regression? Why not a random forest, SVM or another method? Is there some logic to suggest this would be unfruitful?
My plan is currently to use the MatchIt package in R and supply my own distance measure, calculated from a random forest (you can pass your own propensity score to the distance argument of the matchit function).
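For concreteness, here is a minimal sketch of what I have in mind. It assumes the MatchIt and randomForest packages are installed; the simulated data frame dat, its columns x1, x2, x3 and treat, and the choice of out-of-bag probabilities as the score are all my own illustrative assumptions, not an established recipe.

```r
# Sketch only: simulated data and variable names are hypothetical.
library(MatchIt)
library(randomForest)

set.seed(42)
n <- 500
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rbinom(n, 1, 0.4))
# Treatment assignment depends on the covariates (roughly a third treated).
dat$treat <- rbinom(n, 1, plogis(-1 + 0.8 * dat$x1 - 0.5 * dat$x2 + 0.6 * dat$x3))

# Classification forest for treatment assignment.
rf <- randomForest(x = dat[, c("x1", "x2", "x3")],
                   y = factor(dat$treat), ntree = 500)

# With no newdata, predict() returns out-of-bag class probabilities;
# column "1" is the estimated P(treat = 1 | X). Using OOB values is an
# assumption here, meant to avoid in-sample scores piling up at 0 and 1.
ps_rf <- predict(rf, type = "prob")[, "1"]

# Greedy 1:1 nearest-neighbour matching on the user-supplied score.
m_out <- matchit(treat ~ x1 + x2 + x3, data = dat,
                 method = "nearest", distance = ps_rf)

summary(m_out)                 # covariate balance before and after matching
matched <- match.data(m_out)   # matched sample for the outcome analysis
```

The formula is still needed so that matchit knows the treatment variable and which covariates to report balance on; the matching itself uses only ps_rf.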
I'm deterred by the fact that modern propensity scoring packages such as PSAboot don't have a built-in facility to do this for nearest neighbour methods. They do use party and rpart, but only to match using strata (and they only use single trees).
I'm intrigued that methods with (typically) greater predictive power than logistic regression do not appear to be used to create a 'better' propensity score for one-to-one matching. Can anyone shed some light on this?
Linked question: Propensity Score