I am struggling to synthesize the objections to re-randomization in experiments when it is costless, the criteria are pre-specified, and regression is not an option. This came up in the comments on gung's answer to a related question, which references the work of Stephen Senn, like this paper.
Re-randomization is defined as: drawing a treatment assignment at random, checking it against pre-specified balance criteria on baseline covariates (and possibly lagged outcomes), and drawing again until an assignment satisfies the criteria.
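To make the procedure concrete, here is a minimal sketch of that acceptance loop, using the Mahalanobis distance between group covariate means as the balance criterion (in the spirit of Morgan and Rubin 2012, though this simplified statistic omits their sample-size scaling). The threshold, data, and function names are illustrative assumptions, not part of the question.

```python
# Sketch of re-randomization: redraw assignments until a pre-specified
# balance criterion is satisfied. Threshold and data are illustrative.
import numpy as np

def mahalanobis_balance(X, assign):
    """Mahalanobis-type distance between treated and control covariate means."""
    diff = X[assign == 1].mean(axis=0) - X[assign == 0].mean(axis=0)
    cov_inv = np.linalg.pinv(np.cov(X, rowvar=False))
    return float(diff @ cov_inv @ diff)

def rerandomize(X, n_treat, threshold, rng, max_draws=10_000):
    """Draw complete randomizations until one passes the balance check."""
    n = X.shape[0]
    for _ in range(max_draws):
        assign = np.zeros(n, dtype=int)
        assign[rng.choice(n, size=n_treat, replace=False)] = 1
        if mahalanobis_balance(X, assign) <= threshold:
            return assign
    raise RuntimeError("no acceptable assignment found")

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))  # hypothetical baseline covariates
assign = rerandomize(X, n_treat=50, threshold=1.0, rng=rng)
```

The key design feature is that rejection happens before treatment is applied, so the realized assignment is a draw from the restricted set of "acceptably balanced" randomizations rather than from all possible randomizations.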
I believe the usual objections are:
1. Re-randomization to balance baseline covariates and outcomes results in overly "conservative" inference, in the sense that tests will reject true null hypotheses less often than the nominal level and confidence intervals will cover the true value more often than the nominal level.
2. Balance on observed baseline covariates/outcomes tells us nothing about imbalance in unobserved covariates in the sample we will analyze.
3. We care about the population, but imbalance is inherently a property of the sample.
4. You can adjust for imbalance in observed covariates and lagged outcomes in a regression anyway.
On the other hand, there are papers like Morgan and Rubin (2012) that endorse the practice, it seems common in industry (not that the latter proves anything), and it is fairly common in economics research, where researchers often do not control the randomization process and so may need to convince readers that treated units were not cherry-picked. In my work, randomization is often at the group level, and unbalanced groups are common when a finite subset of the population is in the experiment and groups vary in size.
I can see why (1) is a concern, but I believe it could be handled with randomization inference that respects the re-randomization scheme. I also don't mind being somewhat conservative in settings where there are often no adjustments for multiple comparisons.
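The idea of handling (1) with randomization inference can be sketched as follows: build the null distribution of the test statistic only from assignments that pass the same pre-specified balance criterion used at design time, so the reference distribution matches the actual design. Everything here (names, threshold, data) is an illustrative assumption.

```python
# Sketch of randomization inference under a re-randomized design:
# permutations that the design would have rejected are also rejected
# when forming the null distribution.
import numpy as np

def ri_pvalue(y, X, assign_obs, n_treat, threshold, rng, n_draws=2000):
    """Two-sided randomization p-value for the difference in means."""
    cov_inv = np.linalg.pinv(np.cov(X, rowvar=False))  # precompute once

    def diff_means(a):
        return y[a == 1].mean() - y[a == 0].mean()

    def balanced(a):
        d = X[a == 1].mean(axis=0) - X[a == 0].mean(axis=0)
        return float(d @ cov_inv @ d) <= threshold

    obs = abs(diff_means(assign_obs))
    n, extreme, kept = len(y), 0, 0
    while kept < n_draws:
        a = np.zeros(n, dtype=int)
        a[rng.choice(n, size=n_treat, replace=False)] = 1
        if not balanced(a):
            continue  # skip draws the design would have rejected
        kept += 1
        if abs(diff_means(a)) >= obs:
            extreme += 1
    return (extreme + 1) / (n_draws + 1)

# Hypothetical usage on data with no treatment effect (in practice the
# observed assignment would itself come from the re-randomized design).
rng = np.random.default_rng(1)
X = rng.normal(size=(60, 2))
y = X @ np.array([1.0, -1.0]) + rng.normal(size=60)
a0 = np.zeros(60, dtype=int)
a0[rng.choice(60, size=30, replace=False)] = 1
p = ri_pvalue(y, X, a0, n_treat=30, threshold=0.2, rng=rng, n_draws=500)
```

Because the reference set is restricted to balanced assignments, the test keeps its nominal size under the re-randomized design instead of being conservative the way a standard permutation test over all assignments would be.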
With (2), balance on observed covariates and lagged outcomes makes me more willing to believe there is balance on unobserved relevant covariates, since the two are usually correlated. It's something like an informal placebo test.
On (3), I agree that imbalance is a property of the sample, but since we make inferences about the population using the sample, imbalance in the sample still seems undesirable.
(4) is ruled out by the assumption that regression is unavailable in the experimentation platform, and by wariness about letting people throw in controls thoughtlessly.
I am curious if any of this is foolish or misguided.