I have count-data of patients with an event which occurred on different days in the week. For example:

Event occurred on:
Monday Tuesday Wednesday
28     36      48

Let's assume all events together represent the number of patients.

n=112

My question is whether it is valid to do the following:
1. Build a two row contingency table while defining the second row as:

patients with no event on day x = n - events on day x

This would result in the following two row table:

                       Monday   Tuesday   Wednesday
Event occurred         28       36        48
Event not occurred     84       76        64
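A minimal sketch of how this table could be built in R from the counts above. The object name cont1 matches the name assumed further down in the question; events and n are names chosen here purely for illustration.

```r
# Event counts per weekday, taken from the question
events <- c(Monday = 28, Tuesday = 36, Wednesday = 48)

# Assumption from the question: all events together are the number of patients
n <- sum(events)

# Second row defined as "n - events on day x"
cont1 <- rbind("Event occurred"     = events,
               "Event not occurred" = n - events)
cont1

# Column totals: each column adds up to n
colSums(cont1)
```

With this construction every column sums to n = 112, so the table as a whole contains 3 × 112 = 336 counts rather than 112.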

2. Use the chi-square test on this table, e.g. to test the null hypothesis of no difference in events between Monday and Wednesday. For example in R (assume the contingency table is named cont1):

chisq.test(x = cbind(cont1[, 1], cont1[, 3]), correct = FALSE)

I know that I can use the goodness-of-fit test to check this. I already asked a question about this here, where the idea was to test the null hypothesis of no difference in events per hour between Monday and Wednesday. I hope I got this right.

Interestingly, the expected event counts are the same in both tests, but the p-values differ. I'm sure these are two different tests, but is it valid to use both? There already was a thread about this (here), unfortunately with no satisfying answer.
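The comparison described above can be reproduced with the following sketch. It assumes the Monday and Wednesday counts from the question and a goodness-of-fit test against equal probabilities for the two days; the object names (events, tab, test_table, test_gof) are illustrative only.

```r
# Monday and Wednesday counts from the question
events <- c(Monday = 28, Wednesday = 48)
n <- 112  # total number of patients, as assumed in the question

# Test 1: 2 x 2 table whose second row is defined as n - events
tab <- rbind("Event occurred"     = events,
             "Event not occurred" = n - events)
test_table <- chisq.test(tab, correct = FALSE)

# Test 2: goodness of fit of the two event counts
# (chisq.test on a vector defaults to equal probabilities)
test_gof <- chisq.test(events)

# Expected counts: the "Event occurred" row of test 1 vs. test 2
test_table$expected
test_gof$expected

# Test statistics and p-values of both tests
c(table = unname(test_table$statistic), gof = unname(test_gof$statistic))
c(table = test_table$p.value, gof = test_gof$p.value)
```

Both tests have one degree of freedom here, but the statistic of the table test additionally sums the contributions of the "Event not occurred" row, which is why the p-values differ even though the expected event counts agree.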

Tobias

1 Answer

The state space of your model is

$$\{(\text{Event occurred},\text{Monday}), (\text{Event not occurred},\text{Monday}), (\text{Event occurred},\text{Wednesday}), (\text{Event not occurred},\text{Wednesday})\}$$

and the output of chisq.test will give you an indication of whether your $224$ samples from the true distribution on your state space provide evidence against the null hypothesis. Your null hypothesis is that the random variables $X$ (event occurred or not) and $Y$ (Monday or Wednesday) are stochastically independent. But if the $p$-value is large, then you cannot conclude that the null hypothesis is correct!

Based on the size of your contingency table, I think you can also try Fisher's exact test instead of Pearson's chi-squared test. It is available in base R as fisher.test and is also implemented in the algstat package (see also the manual here).
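As a rough sketch, Fisher's exact test can be run on the Monday/Wednesday table with fisher.test from base R (stats package). The counts are taken from the question; the object names tab and res are illustrative.

```r
# 2 x 2 table for Monday and Wednesday, counts from the question
tab <- matrix(c(28, 84, 48, 64), nrow = 2,
              dimnames = list(Event = c("occurred", "not occurred"),
                              Day   = c("Monday", "Wednesday")))

# Two-sided Fisher's exact test of independence between Event and Day
res <- fisher.test(tab)

res$p.value    # exact p-value
res$estimate   # conditional maximum likelihood estimate of the odds ratio
res$conf.int   # confidence interval for the odds ratio
```

As with the chi-squared test, a large p-value here would not show that the null hypothesis of independence is correct.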

Hope that helps.

Tobias Windisch
  • It helps. Thanks. You wrote: **But if the p-value is large, then you cannot conclude that the null hypothesis is correct!** Does this imply that I cannot trust a low p-value either? And what exactly does "large" mean? A p-value > 0.05 in general? Or do you just refer to the 5% threshold for rejecting or retaining the null hypothesis? – Tobias Apr 14 '16 at 08:38
  • A small $p$-value is a good indicator that the null hypothesis is false. But you cannot say that the null hypothesis is correct if the $p$-value is large. Typically, a $p$-value below $0.05$ counts as small, but that depends on the application, I guess. – Tobias Windisch Apr 14 '16 at 09:12
  • Ok, then you refer to the threshold in general. I thought you were writing about some considerations specific to the chi-squared test that were unknown to me. Thanks again. – Tobias Apr 14 '16 at 12:12