
I am trying to find out how to conduct a runs test on a single sample where the two dichotomous values do not occur with equal (0.5/0.5) underlying probabilities.

For instance, take the sequence +++---++++, but assume that a + has an underlying probability of 0.8 of occurring whereas a - has 0.2.

Is there a runs test which can address this?

(If so, is there an R function for it?)

Glen_b
appletree
  • Asking for code / packages is off-topic here. Did you try [Googling it](https://www.google.com/search?q=r+runs+test&oq=r+runs+test&gs_l=serp.3..0j0i30l2j0i8i30l7.141717.141717.0.141940.1.1.0.0.0.0.76.76.1.1.0.msedr...0...1c.1.64.serp..0.1.76.ijRxIo-8Iww)? – gung - Reinstate Monica Apr 02 '15 at 18:19
  • Yes of course. You can alternatively suggest a test that can conduct this analysis. – appletree Apr 02 '15 at 18:21
  • There are many different "runs test" test statistics. Which test statistic are you interested in using? The total number of runs of +'s and -'s? The number of runs of one sign? The length of the longest run? ... Note that the widely used [Wald-Wolfowitz](https://en.wikipedia.org/wiki/Wald%E2%80%93Wolfowitz_runs_test)\* runs test conditions on the total +/- counts, so it doesn't assume p(+)=0.5. \*(incorrectly attributed to them, but that's Stigler's law for you) – Glen_b Apr 03 '15 at 00:35
  • In R, `randtests::runs.test` (on CRAN) can do it (code `-` as `-1` and `+` as `1`, and set the threshold to `0`) – Glen_b Apr 03 '15 at 00:56

1 Answer


If you're interested in a runs test of randomness (which I infer from your tags, though your text doesn't say so), the Wald-Wolfowitz runs test conditions on the totals of the +/- counts, so it doesn't assume $p(+)=\frac{1}{2}$.

The Wald-Wolfowitz test (e.g. see Stevens, 1939 [1]) is a permutation test based on the total number of runs of both kinds, conditional on the total number of symbols of each kind.

I presume you're interested in too few runs (the usual case), which indicates "clumping of signs".

Your example data has 3 runs, with $n_+=7$ and $n_-=3$. There are $\binom{10}{3}=120$ arrangements of the symbols. Of these, 2 have 2 runs (all + followed by all - and vice versa). Of the 3-run cases, 2 have the 7 "+"s in the center and 6 have the 3 "-"s in the center, for a total of 10 arrangements with 3 or fewer runs:

2 runs:
--- +++++++
+++++++ ---

3 runs:
- +++++++ --
-- +++++++ -

+ --- ++++++
++ --- +++++
+++ --- ++++
++++ --- +++
+++++ --- ++
++++++ --- +

so the exact (lower-tail) p-value for your example is 10/120 ≈ 0.083.
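The count above can be checked by brute force. The following is a minimal Python sketch (not the R function mentioned in this thread) that enumerates every arrangement with the same numbers of + and - and counts those with no more runs than observed. The helper names `count_runs` and `exact_lower_pvalue` are made up for illustration, and enumeration is only practical for small samples.

```python
from fractions import Fraction
from itertools import combinations


def count_runs(seq):
    """Return the number of maximal blocks of identical symbols in seq."""
    if not seq:
        return 0
    # A new run starts wherever two neighbouring symbols differ.
    return 1 + sum(1 for a, b in zip(seq, seq[1:]) if a != b)


def exact_lower_pvalue(seq):
    """P(runs <= observed), conditional on the counts of '+' and '-'.

    Hypothetical helper: enumerates all placements of the '+' symbols,
    so it is only feasible for short sequences.
    """
    n = len(seq)
    n_plus = seq.count("+")
    observed = count_runs(seq)
    total = 0
    hits = 0
    for plus_positions in combinations(range(n), n_plus):
        chosen = set(plus_positions)
        arrangement = "".join("+" if i in chosen else "-" for i in range(n))
        total += 1
        if count_runs(arrangement) <= observed:
            hits += 1
    return Fraction(hits, total)


p = exact_lower_pvalue("+++---++++")
print(p)  # Fraction reduces 10/120 to 1/12
```

Using `Fraction` keeps the result exact, so it can be compared directly with the hand count of 10 out of 120.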

It's a widely used test, and most stats packages offer it in some form, though some only offer the asymptotic (normal) approximation, which works better with larger $n_+$ and $n_-$.
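For reference, the asymptotic version is usually computed from the conditional mean $\mu = \frac{2 n_+ n_-}{n} + 1$ and variance $\sigma^2 = \frac{2 n_+ n_- (2 n_+ n_- - n)}{n^2 (n-1)}$ of the number of runs, with $n = n_+ + n_-$. A rough Python sketch follows, assuming a lower-tail ("too few runs") alternative and no continuity correction; the function name is made up for illustration.

```python
import math


def runs_test_normal_approx(n_plus, n_minus, runs):
    """Normal approximation to the runs test, lower tail (too few runs).

    Hypothetical helper; applies no continuity correction.
    Returns (mean, variance, z, lower-tail p-value).
    """
    n = n_plus + n_minus
    prod = 2.0 * n_plus * n_minus
    mean = prod / n + 1.0
    var = prod * (prod - n) / (n ** 2 * (n - 1))
    z = (runs - mean) / math.sqrt(var)
    # Standard normal CDF evaluated at z, via the error function.
    p_lower = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return mean, var, z, p_lower


mean, var, z, p_lower = runs_test_normal_approx(n_plus=7, n_minus=3, runs=3)
print(mean, var, z, p_lower)
```

With only 10 observations the approximate p-value differs noticeably from the exact 10/120, which is why the exact calculation is preferable for samples this small.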

If you're interested in testing something else or using some different statistic, you'll need to say more.

(In R, `randtests::runs.test` (on CRAN) can do the Wald-Wolfowitz test: code `-` as `-1` and `+` as `1`, and set the threshold to `0`. If your sample size is large enough to use the asymptotic approximation, it's easy to code by hand in any case.)

[1] Stevens, W.L. (1939), "Distribution of groups in a sequence of alternatives," Annals of Eugenics, 9:10-17.

Glen_b
  • Thank you very much. Great to confirm this. Do you by any chance have any suggestions on a runs test where the underlying distribution can be given and a comparison is made to it? – appletree Apr 03 '15 at 20:11
  • What do you intend by 'the distribution can be given'? The pmf for the Wald-Wolfowitz statistic is given in many places, not least the reference in my answer. That same reference has a 'runs of one kind' test which has a pmf with a 'standard' distribution (hypergeometric), but I wouldn't normally recommend it over the Wald-Wolfowitz. – Glen_b Apr 04 '15 at 00:13