I have a data set that looks like this:
+------+------------+------------+
| Time | Brad Event | Gary Event |
+------+------------+------------+
| 1    | N          | Y          |
| 2    | N          | Y          |
| 3    | Y          | N          |
| 4    | N          | Y          |
| 5    | N          | N          |
| 6    | Y          | N          |
| 7    | N          | N          |
| 8    | N          | N          |
| ...  | ...        | ...        |
+------+------------+------------+
Over n = 400 time intervals, I found that:
- Brad has an event in 18% of them
- Gary has an event in 6% of them
My null hypothesis is that the probability of an event is the same in the two series. My alternative hypothesis is that the probabilities differ.
Normally this wouldn't be a problem to test, but my data break the independence assumption. The Brad and Gary series are independent of each other, but within a series the occurrence of an event may be influenced by earlier occurrences.
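For reference, here is a minimal sketch of the test I would normally reach for if the intervals were independent: a pooled two-proportion z-test on the summary figures above (18% and 6% of 400 intervals each). The function name `two_proportion_z` is just an illustrative helper, and the result cannot be trusted here precisely because the independence assumption fails.

```python
import math

def two_proportion_z(p1, p2, n1, n2):
    """Pooled two-proportion z statistic and two-sided p-value.

    Assumes every observation is independent, which is the
    assumption my data violate.
    """
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided tail probability of the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Summary figures from the question: 18% vs 6% over 400 intervals each
z, p_value = two_proportion_z(0.18, 0.06, 400, 400)
print(f"z = {z:.2f}, p = {p_value:.2g}")
```

If events cluster within a series, the effective sample size is smaller than 400, so this naive standard error is likely too small.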
From a subjective look at the data, events tend to cluster: several happen close together, then there is a pause, then several more. When I looked at the durations between occurrences, they were not normally distributed (the histogram was all over the place). So I may have serial dependence within each series.
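One way to put a number on that clustering, rather than relying on a subjective look, is the sample autocorrelation of the 0/1 event indicator at lag 1. The sketch below is only illustrative: `lag_autocorr` is a hypothetical helper, and the `brad` list is made-up data with visible clustering, not my real series.

```python
def lag_autocorr(events, lag=1):
    """Sample autocorrelation of a 0/1 event series at the given lag."""
    n = len(events)
    mean = sum(events) / n
    var = sum((x - mean) ** 2 for x in events)
    if var == 0:
        raise ValueError("series is constant; autocorrelation is undefined")
    cov = sum(
        (events[t] - mean) * (events[t + lag] - mean)
        for t in range(n - lag)
    )
    return cov / var

# Made-up example: 1 = event (Y), 0 = no event (N), with events in bursts
brad = [0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0]
r1 = lag_autocorr(brad, lag=1)
print(f"lag-1 autocorrelation: {r1:.2f}")
```

A value clearly above zero would be consistent with events following events more often than chance alone would suggest; a value near zero would suggest the independence assumption is less of a concern.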
Is there a statistical test that I can use to test my null hypothesis even though the data violate the independence assumption?