I have three groups that I would like to match on age and gender. They don't need to be matched one-to-one; I just need group differences p < .05 for both variables. I can find packages that will match TWO groups, but not more than that. Does a package or code snippet exist that handles three?

Secondarily, because the available matching packages only work for two groups, the treatment parameter is required as a logical vector, whereas right now mine are factors. What is the most efficient way to convert a factor to a logical when there are only 2 levels?

Many thanks!

  • Have you considered using propensity scores as a covariate? I don't know if matching between 3 groups is available; it sounds like it would be very hard. – Peter Flom Jan 07 '13 at 23:13
  • This may be helpful for propensity scores: http://stats.stackexchange.com/questions/8604/how-are-propensity-scores-different-from-adding-covariates-in-a-regression-and – Stephan Kolassa Jan 08 '13 at 07:56
  • I haven't read past the abstract myself, but this paper purports to match three groups on a covariate (not on two covariates as you need, but it may be a start): http://www-stat.wharton.upenn.edu/~rosenbap/match2control.pdf – Stephan Kolassa Jan 08 '13 at 07:59
  • ... continuing from my previous comment: so you could first divide the males into three groups as in the paper, then the females - one advantage of one of the covariates being a factor. – Stephan Kolassa Jan 08 '13 at 08:28
  • What is the reason for needing to match as opposed to using covariate adjustment? Matching usually results in some degree of arbitrariness plus residual confounding. Even worse would be if the matching algorithm discarded any data. – Frank Harrell Aug 06 '13 at 12:05
  • A similar but more general question came up in [MathematicaSE](http://mathematica.stackexchange.com/questions/31469/how-to-partition-a-list-to-make-each-subsets-size-equal-and-mean-as-close-as-po). If Stephan's suggestion doesn't work out then you might try mine there -- yes, it would be overkill -- using just one start, doing the males and females separately. – Ray Koopman Sep 05 '13 at 21:28
  • It is a common conceptual error to take the p-value of a test of mean difference as an indicator of goodness of matching. (I assume you meant p > .05, not p < .05.) Some measure of effect size that does not depend on the sample size should be used. Strictly speaking, you want the entire distributions to be as similar as possible. Looking at the means is only a first step; you should also consider at least the standard deviations. And to convince a skeptic, you might need to show that the regressions of the dependent variable on the covariate were similar across groups. – Ray Koopman Sep 05 '13 at 23:06

1 Answer

A simple first try would be this: order the males in ascending order of age. Assign the first three males at random to groups A, B, C. Assign the second three males at random to groups A, B, C. And so on. Then do the same for females. The randomization in each three-person group makes sure that group A doesn't always get the youngest of each three-person group.

This may already yield an acceptable match with little effort, without having to invest a lot of time in finding a sophisticated local search procedure.
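The procedure above can be sketched in a few lines. This is only an illustration in Python, not a tested package: the function name `blocked_assignment`, the `(id, gender, age)` tuple layout, and the `seed` parameter are all assumptions made for the example, and the same logic should carry over to R or any other language.

```python
import random
from collections import defaultdict


def blocked_assignment(subjects, groups=("A", "B", "C"), seed=None):
    """Assign subjects to groups in age-ordered blocks within each gender.

    subjects: iterable of (subject_id, gender, age) tuples (hypothetical layout).
    Returns a dict mapping subject_id -> group label.
    """
    rng = random.Random(seed)

    # Split the sample by gender so each gender is handled separately.
    by_gender = defaultdict(list)
    for subject_id, gender, age in subjects:
        by_gender[gender].append((age, subject_id))

    assignment = {}
    for gender in sorted(by_gender):
        # Order this gender by ascending age.
        ordered = sorted(by_gender[gender], key=lambda pair: pair[0])

        # Walk through consecutive blocks of len(groups) people.
        for start in range(0, len(ordered), len(groups)):
            block = ordered[start:start + len(groups)]

            # Shuffle the labels so no group always gets the youngest.
            labels = list(groups)
            rng.shuffle(labels)

            # A short final block simply receives a random subset of labels.
            for (_, subject_id), label in zip(block, labels):
                assignment[subject_id] = label

    return assignment
```

For example, `blocked_assignment([("s1", "M", 23), ("s2", "M", 25), ("s3", "M", 31)], seed=42)` puts each of the three males in a different group. If a gender's count is not a multiple of three, the last block is incomplete and the group sizes will differ by one for that gender.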

Stephan Kolassa