
I have a process with binary output. Is there a standard way to test whether it is a Bernoulli process? The problem translates to checking whether every trial is independent of the previous trials.

I have observed some processes where a result "sticks" for a number of trials.

whuber
Penz

2 Answers


I think you can devise a large number of tests, and the right choice depends on the alternative hypothesis you have in mind... I just have a few remarks. I am placing this post under community wiki, as I feel it can be improved a lot.

  • The first thing I would try: divide your sample into $n$ subsamples of size $k$; if the trials are independent, the number $X_i$ of successes in subsample $i$ is $\mathrm{Bin}(k,p)$, so you have $n$ independent values $X_1, \dots, X_n$ of a binomial variable. This can be tested by a $\chi^2$ goodness-of-fit test; see the sketch after this list. (The choice of $n, k$ will depend on the total sample size...)

  • Considering your remark about the process sticking for a number of trials, it seems that you have positive correlation between successive trials in mind. I think the above test can be powerful in this case. However, you can also consider the lengths of runs in your sequence of trials: denoting by $L_1$ (resp. $L_0$) the length of a run of 1s (resp. of 0s), you have $$\mathbb P(L_1 = k) = p^{k-1} (1-p), \qquad \mathbb P(L_0 = k) = (1-p)^{k-1} p.$$ These are geometric distributions; beware that in some software, such as R, $k$ is shifted by $1$. You can again test the observed run lengths for goodness of fit, as in the sketch after this list.

  • I don't know how the preceding suggestion performs compared to the Wald–Wolfowitz runs test.

  • Special attention has been given to the case $p={1\over 2}$, as this is the case of a sequence of random bits generated by a random number generator. A battery of tests, known as the Diehard tests, was devised by George Marsaglia. You can surely generalize (most of) these tests to the case $p \ne {1\over 2}$. (But is it worth it?)
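
To make the first two suggestions concrete, here is a rough R sketch. The simulated data, the subsample size `k`, and the use of `ks.test` for the run lengths are my own illustrative assumptions, not a canonical recipe:

set.seed(1)
x <- rbinom(240, 1, 1/2)                    # hypothetical binary trials
p.hat <- mean(x)
# Subsample counts: successes in subsamples of size k should be Bin(k, p).
k <- 8
X <- colSums(matrix(x, nrow = k))           # X_1, ..., X_n with n = length(x)/k
obs <- tabulate(X + 1, nbins = k + 1)       # frequencies of 0, ..., k successes
chisq.test(obs, p = dbinom(0:k, k, p.hat))  # pool small cells in practice; one
                                            # df is also lost to estimating p
# Run lengths: L - 1 is geometric under independence (note R's shift by 1).
r <- rle(x)
L1 <- r$lengths[r$values == 1]              # lengths of the runs of 1s
ks.test(L1 - 1, pgeom, 1 - p.hat)           # crude check: ties make it approximate
# The Wald-Wolfowitz runs test itself lives in contributed packages,
# e.g. (assuming tseries is installed) tseries::runs.test(factor(x)).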

Elvis
  • The runs test looks promising. I wonder how it compares in power to the alternatives suggested in this thread? – whuber Jan 04 '12 at 17:34
  • @whuber Yes, if my days were 96 hours long I would be right on it :) If you can read Italian, the article on the WW test is a bit more detailed. Before assessing the power, defining good alternative models is not so easy... the worst case is that of most modern RNGs, which are fully deterministic periodic sequences for which these tests will have no power at all. – Elvis Jan 04 '12 at 20:19

Let the sequence of values be realizations of random variables $X_i$, $1\le i\le n$, each identically distributed as a Bernoulli($p$) variable (with $p$ unknown). When they are independent, the sequence is a Markov chain with transition probabilities

$$\Pr(x \to 0) = 1-p, \quad \Pr(x \to 1) = p$$

for $x = 0,1$. The sequence yields $n-1$ (presumably) independent transitions $X_i \to X_{i+1}$, $1\le i\le n-1$, which can be summarized in a $2 \times 2$ matrix of their counts. If the sequence is independent, the "from" and "to" values in this table are independent. So, test the table for independence and reject the hypothesis of an independent sequence if the test is significant.

For example, here is a sequence of 24 binary outcomes:

1 1 1 1 1 1 1 1 0 0 1 1 1 1 1 1 1 1 0 1 1 0 0 0

The 23 transitions in this sequence are 1->1, 1->1, ..., 1->0, 0->0, 0->0. Their table of counts is

To:      0  1
From  0: 3  2
      1: 3 15

Its chi-squared statistic is 1.8947. The p-value (using the chi-squared approximation, which is not very good due to the small cell counts in the table) is 0.1687: no evidence of dependence. Indeed, these 24 values were generated randomly and independently with a 2/3 chance of the outcome 1 and a 1/3 chance of a 0 (i.e., $p=2/3$).
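
For the record, this table and its test can be reproduced in R; the vector `v` below is just the 24 values above, and `chisq.test` applies Yates's continuity correction by default (see the comments below):

v <- c(1,1,1,1,1,1,1,1,0,0,1,1,1,1,1,1,1,1,0,1,1,0,0,0)
counts <- table(From = v[-length(v)], To = v[-1])  # the 2 x 2 transition table
counts
chisq.test(counts)                   # X-squared = 1.8947, p-value = 0.1687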

If the hypothesis of independence is not rejected, you can continue to test for higher-order dependence or seasonal dependence (by checking transitions from season to season rather than between consecutive values; a minimal sketch of this lag-based check follows the code below).

Here is sample R code to compute the chi-squared test p-values for any desired order (1, 2, ...) and to simulate the null distribution (which can be used for a more accurate permutation test of the independence hypotheses if you wish).

set.seed(17)
x <- rbinom(256, 1, 2/3)         # Sample data generated with the null distribution.
cc <- function(x, k) {           # Chi-squared test of kth order independence in x.
    n <- length(x) - k - 1       # There are n+1 windows of k consecutive values.
    m <- sapply(1:k, function(j) x[j:(n+j)])  # One column per lag within a window.
    y <- m %*% (2^(0:(k-1)))     # Classifies length-k subsequences of x.
    chisq.test(y, x[(k+1):length(x)])$p.value # Tests next value against the class.
}
sapply(1:2, function(k) cc(x, k)) # P-values for chi-squared tests of orders 1, 2.
order <- 1                        # Use 2 for second order, etc.
hist(replicate(999, cc(sample(x), order))) # Simulated null distribution of the p-value.
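
The seasonal variant mentioned above works the same way: tabulate the lag-$s$ transitions $X_i \to X_{i+s}$ and test that table for independence. A minimal sketch, using `x` from above and a hypothetical season length:

s <- 12                          # hypothetical season length
n <- length(x)
chisq.test(table(x[1:(n - s)], x[(s + 1):n]))  # tests X_i against X_(i+s)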
whuber
  • (+1) Nice. Very efficient way to compute the table of transitions. – Elvis Jan 04 '12 at 16:47
  • Thanks, @Elvis. I appreciate any constructive comments about the `R` code. – whuber Jan 04 '12 at 17:32
  • @whuber I am confused by this test. Could you elaborate on how you calculate the chi-square test statistic from the table of counts in your example? – M.B.M. Nov 13 '14 at 05:42
  • @M.B.M This is the standard [Pearson $\chi^2$ test](http://en.wikipedia.org/wiki/Pearson%27s_chi-squared_test). It is illustrated in the `R` code in this answer. The test of that particular table can be executed in `R` as `chisq.test(matrix(c(3,2,3,15),2))`, for instance. – whuber Nov 13 '14 at 06:24
  • @whuber Your test makes sense. The reason I asked my question is that the $\chi^2$ statistic 1.8947 that you calculated with `R` doesn't match the $\chi^2$ statistic I calculated by hand using the formula on the page you cite: $\frac{(3-5*6/23)^2}{5*6/23}+\frac{(2-5*17/23)^2}{5*17/23}+\frac{(3-18*6/23)^2}{18*6/23}+\frac{(15-18*17/23)^2}{18*17/23}=3.8101$. What am I doing wrong? (I don't use `R`, though I'm quite familiar with MATLAB.) – M.B.M. Nov 25 '14 at 18:09
  • @M.B.M. Good question. Your calculation is right--but `R` uses a continuity correction by default to create more accurate results with small amounts of data: "one half is subtracted from all $|O-E|$ differences; however, the correction will not be bigger than the differences themselves." In a $2\times 2$ table like this, all the $|O-E|$ differences are equal--in this case, they are $|3-5\times 6/23|\approx 1.6957\ge 1/2$. Thus the statistic is reduced by a factor of $(1 - 1/(2|3-5\times 6/23|))^2\approx 0.4972$, thereby turning $3.81$ into $3.81\times 0.4972\approx 1.8947$. – whuber Nov 25 '14 at 19:20
  • @whuber Aha! So `R` applies [Yates's correction for continuity](http://en.wikipedia.org/wiki/Yates%27s_correction_for_continuity) because there are small values in your table (like 2 and 3). If the table contained large values, then the continuity correction would not be necessary. Is that correct? – M.B.M. Nov 25 '14 at 19:55
  • @M.B.M. Yes, that's right (although I believe `R` would, by default, apply the continuity correction even to large values). – whuber Nov 25 '14 at 23:58
  • @whuber Thanks! I think I won't apply the continuity correction since the values in my tables are large (on the order of hundreds). – M.B.M. Nov 26 '14 at 04:28
  • This test seems to account only for lag-1 correlation; it might fail for lag-$k$ correlation. For example, consider the sequence repeating 1001: 1001100110011001.... – Statisfun Mar 12 '19 at 15:48
  • @Statisfun Thank you for that observation. But did you read my remarks about "higher-order dependence"? I suspect you might be describing the second argument to the `cc` function. – whuber Mar 12 '19 at 15:54