
In my case, each active subject is matched to a variable number of control subjects. The only approach I can think of is to first average across the control subjects matched to the same active subject, producing a 1:1 matched sample from which it is easy to compute the mean difference and similar statistics.

Is this the correct way? I am not getting the same results as the person who submitted the results for review.

The matching was not done using matchit(). Therefore, I cannot use its summary or cobalt to assess covariate balance.

hehe

1 Answer


cobalt can check balance on data not matched with MatchIt. If you have either the matching weights or pair membership, you can supply either of those to bal.tab(). If you have the matching weights, you would enter the following:

bal.tab(treat ~ x1 + x2 + x3, data = data, weights = matching_weights)

where treat is the treatment, x1, etc. are the covariates, data is the dataset containing all the units prior to matching, and matching_weights is a vector of matching weights. For variable k:1 matching, you construct matching weights as follows: unmatched units get a weight of 0, matched treated units get a weight of 1, and matched control units get a weight of 1 divided by the total number of control units matched to the same treated unit. So, if treated unit A was matched to control units B and C, B and C would each get a weight of 1/2. This is equivalent to what you were doing manually.
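
The weighting rule above can be sketched in base R. This is only an illustration on made-up data: the data frame `d`, its columns `id`, `pair_id`, and `x1`, and all values are hypothetical, and `pair_id` is assumed to be `NA` for unmatched units.

```r
# Hypothetical toy data: A is matched to B and C, D is matched to E,
# and F and G are unmatched (pair_id is NA).
d <- data.frame(
  id      = c("A", "B", "C", "D", "E", "F", "G"),
  treat   = c(1, 0, 0, 1, 0, 0, 0),
  pair_id = c(1, 1, 1, 2, 2, NA, NA),
  x1      = c(5, 4, 6, 3, 2, 9, 8)
)

matched <- !is.na(d$pair_id)

# Number of matched controls in each matched unit's pair.
n_ctrl <- ave(1 - d$treat[matched], d$pair_id[matched], FUN = sum)

# Unmatched units get 0, matched treated units get 1,
# matched controls get 1 / (number of controls in their pair).
d$weights <- 0
d$weights[matched] <- ifelse(d$treat[matched] == 1, 1, 1 / n_ctrl)

# Check the equivalence with the manual approach: averaging controls
# within each pair, then averaging those pair-level means.
ctrl <- d$treat == 0
weighted_ctrl_mean <- weighted.mean(d$x1[ctrl], d$weights[ctrl])
per_pair_ctrl_mean <- tapply(d$x1[ctrl & matched], d$pair_id[ctrl & matched], mean)
manual_ctrl_mean   <- mean(per_pair_ctrl_mean)
```

In this toy example both routes give the same control-group mean, which is why the weighted approach and the "collapse to 1:1" approach agree on the mean difference.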

If you have pair membership, you should transform it into weights in order to be able to estimate the treatment effect using a weighted regression (e.g., as described in the MatchIt documentation). If you don't want to compute matching weights but have pair membership, you can supply pair membership to the match.strata argument of bal.tab(), which will automatically construct matching weights and use them to compute the balance statistics. The argument supplied to match.strata should be a factor with NA for unmatched units and a pair identifier for the matched units.
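
As a rough sketch of the weighted-regression idea, the following uses base R's `lm()` with the matching weights on made-up data. The outcome column `y` and all values are hypothetical, and the sketch covers only the point estimate, not how standard errors should be computed. It also shows pair membership stored as a factor with `NA` for unmatched units, the shape described above for `match.strata`.

```r
# Hypothetical toy data with precomputed matching weights.
d <- data.frame(
  treat   = c(1, 0, 0, 1, 0, 0, 0),
  pair_id = c(1, 1, 1, 2, 2, NA, NA),
  weights = c(1, 0.5, 0.5, 1, 1, 0, 0),
  y       = c(10, 7, 9, 12, 8, 3, 4)
)

# Pair membership as a factor; unmatched units stay NA.
d$pair_id <- factor(d$pair_id)

# Weighted regression of the outcome on treatment.
# Units with weight 0 do not contribute to the fit.
fit     <- lm(y ~ treat, data = d, weights = weights)
att_hat <- unname(coef(fit)["treat"])

# With a single binary regressor, the coefficient is the
# difference in weighted group means.
treated_mean <- weighted.mean(d$y[d$treat == 1], d$weights[d$treat == 1])
control_mean <- weighted.mean(d$y[d$treat == 0], d$weights[d$treat == 0])
```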

Remember that when computing the standardized mean differences, there are several ways to compute the standardization factor. For matching, it makes the most sense to use the standard deviation of the covariate in the treated group prior to matching. It may be that the results you are comparing yours to used a different formula.
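
To see how much the choice of standardization factor can matter, here is a small base R sketch on made-up data. The column names and values are hypothetical, and the pooled formula shown is just one common alternative, not necessarily the one the other analyst used.

```r
# Hypothetical toy data with precomputed matching weights.
d <- data.frame(
  treat   = c(1, 0, 0, 1, 0, 0, 0),
  weights = c(1, 0.5, 0.5, 1, 1, 0, 0),
  x1      = c(5, 4, 6, 3, 2, 9, 8)
)
is_t <- d$treat == 1

# Numerator: difference in weighted covariate means after matching.
mean_diff <- weighted.mean(d$x1[is_t], d$weights[is_t]) -
  weighted.mean(d$x1[!is_t], d$weights[!is_t])

# Denominator option 1: SD of the covariate among all treated units,
# computed on the unweighted, pre-matching sample.
sd_treated <- sd(d$x1[is_t])

# Denominator option 2: pooled SD of the two groups, also pre-matching.
sd_pooled <- sqrt((var(d$x1[is_t]) + var(d$x1[!is_t])) / 2)

smd_treated <- mean_diff / sd_treated
smd_pooled  <- mean_diff / sd_pooled
```

The numerator is identical in both cases; only the denominator differs, so two analysts with the same matched sample can still report different SMDs.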

Noah
  • This is very clear and helpful. I will do some exploration. Thank you so much! – hehe Mar 14 '21 at 00:59
  • It seems the CreateTableOne() function in R is also commonly used to show standardized mean difference, but I am not sure if it works for variable ratio matching. bal.tab() works great. – hehe Apr 02 '21 at 16:13
  • Can I ask you about the "diff.adj" in the output from bal.tab() using the "match.strata" option? How is the adjusted SMD computed? Do you first obtain the least-squares mean difference from a regression model adjusting for match strata and then standardize that difference? We want to make sure the description is accurate. Thanks – hehe Apr 02 '21 at 19:06
  • `bal.tab()` computes matching weights from stratum membership. It does so by computing a propensity score as the proportion of treated units in each stratum and assigning that to each unit in the corresponding stratum, then puts that propensity score into the formula for computing weights from propensity scores for the given estimand. For k:1 matching for the ATT, control units are weighted proportional to the inverse of the number of control units in their stratum. Then those matching weights are used in the standard formulas. No models are fit. – Noah Apr 02 '21 at 19:38
  • I see. Thanks! So the "match.strata" approach will give the same results as the approach that specifies weights directly based on the number of matched controls. – hehe Apr 02 '21 at 20:05
  • By the way, supplying manually computed weights to bal.tab() does not seem to work. It gave the following error: "Error: The argument supplied to 'weights' must be a named list of weights, names of variables containing weights in an available data set, or objects with a get.w() method." I intended to check whether the answer matches that from "match.strata". – hehe Apr 02 '21 at 20:13
  • Is there general agreement on what counts as a good SMD? Some papers suggest an absolute value within 0.25. I think this might be related to the usual caliper used for propensity score matching. – hehe Apr 03 '21 at 02:26
  • Are you using the most updated version of `cobalt`? I often update it with bug fixes. The version currently on CRAN is the most updated. 0.25 is way too big unless you're also adjusting for those covariates in the outcome model. I say .01 for strong predictors of the outcome, .05 for moderate, and .1 for weak. Any covariates with SMDs greater than .1 should be adjusted for in the outcome model (but it's a good idea to adjust for as many as you can anyway). – Noah Apr 03 '21 at 07:22
  • Yes, I used the most recent version, 4.3.1. I posted my R code below as an answer since it is difficult to post it in the comments section. – hehe Apr 03 '21 at 16:08
  • Thanks so much. I just realized 0.1 is considered bad... We did do an additional analysis adjusting for covariates in the outcome model. – hehe Apr 03 '21 at 16:15
  • `bal.tab()` doesn't use non-standard evaluation like `dplyr`. Use `weights = "weights"`. – Noah Apr 03 '21 at 22:08
  • Thank you so much!! It worked after I changed to `weights = "weights"`, and the result matched that from using "match.strata" exactly. Feeling very happy now. Usually we are not required to put quotes around a variable name to refer to a column of a data frame in R functions, so I would not have thought of this if you had not told me. The "dplyr" procedures do return a regular data frame, and they help shorten the code. I recently started using it and found it convenient for data manipulation. – hehe Apr 03 '21 at 23:04
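
The construction Noah describes in the comments (a stratum-level propensity score fed into the ATT weighting formula) can be checked by hand against the direct "1 over the number of matched controls" rule. This is a base R sketch on made-up data, not cobalt's actual implementation. The data frame and its values are hypothetical, and the ATT weight for controls is taken to be the usual ps / (1 - ps).

```r
# Hypothetical toy data: pair_id is NA for unmatched units.
d <- data.frame(
  treat   = c(1, 0, 0, 1, 0, 0, 0),
  pair_id = c(1, 1, 1, 2, 2, NA, NA)
)
matched <- !is.na(d$pair_id)

# Stratum "propensity score": proportion of treated units in each stratum.
ps <- ave(d$treat[matched], d$pair_id[matched], FUN = mean)

# ATT weights from a propensity score:
# 1 for treated units, ps / (1 - ps) for control units.
w_ps <- rep(0, nrow(d))
w_ps[matched] <- ifelse(d$treat[matched] == 1, 1, ps / (1 - ps))

# Direct construction: 1 / (number of controls in the stratum).
n_ctrl <- ave(1 - d$treat[matched], d$pair_id[matched], FUN = sum)
w_direct <- rep(0, nrow(d))
w_direct[matched] <- ifelse(d$treat[matched] == 1, 1, 1 / n_ctrl)
```

With one treated unit and k controls in a stratum, ps = 1 / (k + 1), so ps / (1 - ps) = 1 / k, which is why the two sets of weights coincide here. As the thread notes, when such weights are stored in a column of the data set, `bal.tab()` expects the column name as a string, i.e. `weights = "weights"`.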