
It's often stated that permutation tests have no assumptions, but this is certainly not true. For example, if my samples are somehow correlated, I can imagine that permuting their labels would not be the correct thing to do. The only thing I found about this problem is this sentence from Wikipedia: "An important assumption behind a permutation test is that the observations are exchangeable under the null hypothesis." I don't understand what that means.

What are the assumptions of permutation tests? And how are these assumptions connected to different possible permutation schemes?

Alexis
rep_ho
  • (+1) The Wikipedia quotation, although correct, is kind of funny, because when you get through the (obscuring) technical jargon it comes down to saying you should permute exactly those observations that you assume you can permute. – whuber Dec 17 '14 at 16:55
  • Difficult to answer since there are so many different permutation tests. E.g., for a k-sample comparison, heteroscedasticity between groups would violate the exchangeability assumption. – Michael M Dec 17 '14 at 17:05
  • (+1) Based on Rubin (2015) "Causal Inference for Statistics", when the label (or treatment) is independent of the potential outcomes, you can use a permutation test. The logic is that each subject has two potential outcomes, one under label A and one under label B, and these are fixed. The label assignment procedure is random, and if it is independent of the potential outcomes, you can think of performing this label assignment N times, where N is the total number of permutations of the labels. That gives you a distribution of the statistic you care about, and you then check which quantile of that distribution the observed value falls at. – KevinKim Jan 13 '17 at 17:02
  • See "A Primer on Quantitized Data Analysis and Permutation Testing" (page 88). – Davester Dec 17 '14 at 17:03
  • [Exchangeability](https://en.wikipedia.org/wiki/Exchangeable_random_variables). Also: [[tag:exchangeability]]. – Alexis May 28 '20 at 18:45

1 Answer


The literature distinguishes between two types of permutation tests: (1) the randomization test is the permutation test where exchangeability is satisfied by random assignment of experimental units to conditions; (2) the permutation test is the exact same test but applied to a situation where other assumptions (i.e., other than random assignment) are needed to justify exchangeability.

Some references regarding the naming conventions (i.e., randomization vs. permutation): Kempthorne & Doerfler, Biometrika, 1969; Edgington & Onghena, Randomization Tests, 4th Ed., 2007.

For assumptions, the randomization test (i.e., Fisher's randomization test for experimental data) requires only what Donald Rubin calls the stable unit treatment value assumption (SUTVA); see Rubin's 1980 comment on Basu's paper in JASA. SUTVA is also one of the fundamental assumptions (along with strong ignorability) for causal inference under the Neyman-Rubin potential outcomes model (cf. Paul Holland's 1986 JASA paper). Essentially, SUTVA says that there is no interference between units and that the treatment conditions are the same for all recipients. More formally, each unit's potential outcomes depend only on the treatment that unit itself receives, not on the treatments assigned to other units. Independence between the potential outcomes and the assignment mechanism is a separate requirement, which random assignment guarantees by design.

Consider the two-sample problem with participants randomly assigned to a control group or a treatment group. SUTVA would be violated if, for example, two study participants were acquainted and the assignment status of one of them exerted some influence on the outcome of the other. This is what is meant by no interference between units.
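To make the two-sample setting concrete, here is a minimal sketch of a Monte Carlo randomization test for a difference in means. It is not taken from any of the cited papers; the function name `randomization_test` and its parameters are hypothetical, and it assumes that units really were assigned to groups at random and that SUTVA holds, so that every reassignment of the labels is equally likely under the null hypothesis.

```python
import random
from statistics import mean


def randomization_test(control, treatment, n_perm=10_000, seed=0):
    """Monte Carlo two-sided randomization test for a difference in means.

    Hypothetical helper for illustration. Assumes random assignment and
    no interference between units (SUTVA).
    """
    rng = random.Random(seed)
    pooled = list(control) + list(treatment)
    n_control = len(control)
    observed = abs(mean(treatment) - mean(control))

    hits = 0
    for _ in range(n_perm):
        # Re-run the assignment: shuffle the outcomes, then split them into
        # groups of the original sizes.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[n_control:]) - mean(pooled[:n_control]))
        if diff >= observed:
            hits += 1

    # Count the observed assignment itself so the p-value is never zero.
    return (hits + 1) / (n_perm + 1)
```

If one participant's assignment influenced another participant's outcome, the shuffled data sets would no longer represent outcomes that could have been observed under a different assignment, and the reference distribution built in the loop would lose its justification.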

The above discussion applies to the randomization test, in which participants are randomly assigned to groups. SUTVA is also necessary for a permutation test, but the justification for permuting cannot rest on randomization, because there was none.

In the absence of random assignment, the validity of permutation tests may rely on distributional assumptions like identical shape of distribution or symmetric distributions (depending on the test) to satisfy exchangeability (see Box and Anderson, JRSSB, 1955).
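As one illustration of how the permutation scheme and the distributional assumption go together, consider a one-sample or paired design in which the "permutation" consists of flipping the signs of the differences. The sketch below is my own example, not one from Box and Anderson; `sign_flip_test` is a hypothetical name. Sign flipping is justified only if, under the null hypothesis, each difference is distributed symmetrically about zero, so that a value and its negative are equally likely.

```python
import random


def sign_flip_test(diffs, n_perm=10_000, seed=0):
    """Monte Carlo two-sided sign-flip test on paired differences.

    Hypothetical helper for illustration. Assumes that under H0 each
    difference is symmetric about zero.
    """
    rng = random.Random(seed)
    observed = abs(sum(diffs))

    hits = 0
    for _ in range(n_perm):
        # Flip each sign independently with probability 1/2.
        flipped = sum(d if rng.random() < 0.5 else -d for d in diffs)
        if abs(flipped) >= observed:
            hits += 1

    # Count the observed sign pattern itself so the p-value is never zero.
    return (hits + 1) / (n_perm + 1)
```

With skewed differences the sign-flipped data sets are not equally likely under the null, which is the same exchangeability failure in a different permutation scheme.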

In an interesting paper, Hayes, Psych Methods, 1996, shows through simulation how Type I error rates may become inflated if permutation tests are used with non-randomized data.
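The kind of inflation described here can be reproduced in a small simulation. The sketch below is not Hayes's design; it is an illustrative setup of my own, with hypothetical function names, in which two non-randomized groups have equal means but unequal variances and unequal sizes, so the observations are not exchangeable even though the null hypothesis about the means is true.

```python
import random


def perm_pvalue(x, y, n_perm, rng):
    """Monte Carlo two-sided permutation p-value for a difference in means."""
    pooled = x + y
    n_x, n_y = len(x), len(y)
    total = sum(pooled)

    def abs_diff(values):
        # The first n_x entries play the role of group x.
        s_x = sum(values[:n_x])
        return abs(s_x / n_x - (total - s_x) / n_y)

    observed = abs_diff(pooled)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if abs_diff(pooled) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


def type1_rate(n_x, sd_x, n_y, sd_y, n_sim=300, n_perm=199, alpha=0.05, seed=0):
    """Share of simulated equal-mean data sets rejected at level alpha."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sim):
        # Both groups have mean zero; only the spread (and size) may differ.
        x = [rng.gauss(0.0, sd_x) for _ in range(n_x)]
        y = [rng.gauss(0.0, sd_y) for _ in range(n_y)]
        if perm_pvalue(x, y, n_perm, rng) <= alpha:
            rejections += 1
    return rejections / n_sim
```

Comparing `type1_rate(8, 1.0, 40, 1.0)` with `type1_rate(8, 5.0, 40, 1.0)` should show a rejection rate near the nominal level in the first case and a clearly higher one in the second, where the smaller group has the larger variance. Strictly speaking, the permutation test is still valid for the null of identical distributions; the inflation appears when it is read as a test of equal means.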

bsbk