
Suppose we have the following data set:

                Men    Women    
Dieting         10      30
Non-dieting     5       60

If I run the Fisher exact test in R then what does alternative = greater (or less) imply? For example:

mat = matrix(c(10,5,30,60), 2,2)
fisher.test(mat, alternative="greater")

I get the p-value = 0.01588 and odds ratio = 3.943534. Also, when I flip the rows of the contingency table like this:

mat = matrix(c(5,10,60,30), 2, 2)
fisher.test(mat, alternative="greater")

then I get the p-value = 0.9967 and odds ratio = 0.2535796. But when I run either contingency table without the alternative argument (i.e., fisher.test(mat)), I get the p-value = 0.02063 in both cases.

  1. Could you please explain the reason to me?
  2. Also, what is the null hypothesis and alternative hypothesis in the above cases?
  3. Can I run the fisher test on a contingency table like this:

    mat = matrix(c(5000,10000,69999,39999), 2, 2)
    

PS: I am not a statistician. I am trying to learn statistics so your help (answers in simple English) would be highly appreciated.

gung - Reinstate Monica
snape

1 Answer


alternative = "greater" (or "less") requests a one-sided test of the null hypothesis that p1 = p2 against the alternative p1 > p2 (or p1 < p2). In contrast, a two-sided test compares the same null hypothesis to the alternative that p1 is not equal to p2.
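R's help page for fisher.test states the one-sided alternatives in terms of the odds ratio: for a 2 by 2 table, alternative = "greater" tests whether the true odds ratio is greater than 1, which for the question's table corresponds to p1 > p2. A minimal sketch using the question's data (the variable names are illustrative, not from the original post):

```r
# The question's table: rows are Dieting / Non-dieting, columns are Men / Women.
mat <- matrix(c(10, 5, 30, 60), 2, 2,
              dimnames = list(c("Dieting", "Non-dieting"), c("Men", "Women")))

# One-sided test: H0 is odds ratio = 1, H1 is odds ratio > 1.
res_greater <- fisher.test(mat, alternative = "greater")

# Two-sided test (the default): H1 is odds ratio != 1.
res_two_sided <- fisher.test(mat)

res_greater$null.value    # the odds ratio under H0, i.e. 1
res_greater$estimate      # estimated odds ratio, about 3.94 as reported in the question
res_greater$p.value       # about 0.01588 as reported in the question
res_two_sided$p.value     # about 0.02063 as reported in the question
```

The estimated odds ratio is the same whichever alternative is chosen; only the p-value and the confidence interval change.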

Here p1 is the proportion of dieters who are male and p2 is the proportion of non-dieters who are male. In your sample, the proportion of dieters who are male is 10 out of 40, i.e. 1/4 = 0.25. The proportion of non-dieters who are male is 5 out of 65, i.e. 1/13 ≈ 0.077. So the estimate for p1 is 0.25 and the estimate for p2 is 0.077, and it appears that p1 > p2.
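The arithmetic above can be checked directly in R. This is only a sketch: the dimnames and variable names are added here for readability and are not part of the original question.

```r
mat <- matrix(c(10, 5, 30, 60), 2, 2,
              dimnames = list(c("Dieting", "Non-dieting"), c("Men", "Women")))

# Row-wise proportions: each row is divided by its row total.
row_props <- prop.table(mat, margin = 1)

p1 <- row_props["Dieting", "Men"]       # 10 / 40 = 0.25
p2 <- row_props["Non-dieting", "Men"]   # 5 / 65, roughly 0.077

# Sample (cross-product) odds ratio: (10 * 60) / (30 * 5) = 4
sample_or <- (mat[1, 1] * mat[2, 2]) / (mat[1, 2] * mat[2, 1])
```

The sample odds ratio is exactly 4, while fisher.test reports 3.943534. The difference arises because fisher.test returns the conditional maximum likelihood estimate rather than the plain cross-product ratio.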

That is why for the one-sided alternative p1 > p2 the p-value is 0.01588. (A small p-value means that data at least this extreme would be unlikely if the null hypothesis were true, so it counts as evidence in favour of the alternative.)

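When the alternative is p1 < p2, your data point in the wrong (or unanticipated) direction. This is what happens when you flip the rows and keep alternative = "greater": flipping the rows swaps the roles of p1 and p2, so the test is now asking for a difference in the direction opposite to the one observed.

That is why in that case the p-value is so high, 0.9967. For the two-sided alternative the p-value should be a little higher than for the one-sided alternative p1 > p2, and indeed it is, with a p-value equal to 0.02063. The two-sided test does not care about direction, which is why both versions of the table give the same value.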
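The relationship between flipping the rows and switching the direction of the test can be checked with a short sketch. The variable names are illustrative; `mat[2:1, ]` simply reverses the row order, which reproduces the second matrix in the question.

```r
mat <- matrix(c(10, 5, 30, 60), 2, 2)
flipped <- mat[2:1, ]   # same as matrix(c(5, 10, 60, 30), 2, 2)

# "greater" on the flipped table asks the same question as "less" on the original.
p_less_original   <- fisher.test(mat, alternative = "less")$p.value
p_greater_flipped <- fisher.test(flipped, alternative = "greater")$p.value

# The two-sided p-value ignores direction, so row order does not matter.
p_two_original <- fisher.test(mat)$p.value
p_two_flipped  <- fisher.test(flipped)$p.value

# Swapping rows and columns (transposing) leaves the two-sided p-value unchanged too.
p_two_transposed <- fisher.test(t(mat))$p.value

all.equal(p_less_original, p_greater_flipped)
all.equal(p_two_original, p_two_flipped)
all.equal(p_two_original, p_two_transposed)
```

So the direction of a one-sided test is tied to how the table is laid out, and it is worth writing down which cell is "first" before choosing between "greater" and "less".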

unutbu
Michael R. Chernick
  • Fantastic explanation. So, the Fisher exact test actually compares probabilities between rows as opposed to columns? – Christian Oct 12 '17 at 22:21
  • @Christian: No, it doesn't matter whether it's rows or columns, as the Fisher test checks for association in a contingency table. Rows and columns don't matter directly. You could also just reformulate the hypothesis: instead of the hypothesis being "people who smoke die younger" you could equally take it to be "people who die younger are more likely to smoke". The result of the Fisher test tells you whether the association observed in the data is compatible with the null hypothesis of no association, but it doesn't matter which variable is the independent or dependent one, and equally the choice of rows/columns doesn't matter :) – Dominique Paul Apr 26 '20 at 07:06