Balanced design (1000 of each type): If I toss one coin 1000 times and get 1 Head, and I toss another
coin 1000 times and get 1000 Heads, then I am comfortable concluding
that the two coins behave differently. The Minitab output for the corresponding two-by-two table is shown below. Both the
chi-squared test (equivalent here to the two-sided Z test in the output) and Fisher's exact test give a tiny P-value, far below $0.05.$ (In Minitab,
P-values are rounded to three places, so 0.000
means a value less than $0.0005.$)
Test and CI for Two Proportions
Sample X N Sample p
1 1 1000 0.001000
2 1000 1000 1.000000
Difference = p (1) - p (2)
Estimate for difference: -0.999
95% CI for difference: (-1, -0.997041)
Test for difference = 0 (vs ≠ 0):
Z = -44.68 P-Value = 0.000
* NOTE * The normal approximation may be inaccurate for small samples.
Fisher’s exact test: P-Value = 0.000
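For readers without Minitab, the two tests in the output above can be approximated with a minimal Python sketch. The function names `pooled_z_test` and `fisher_exact_two_sided` are my own, not a library API, and I am assuming the Z statistic uses the pooled proportion (that assumption reproduces the printed value) and that the two-sided Fisher P-value sums the probabilities of all tables no more likely than the observed one.

```python
from fractions import Fraction
from math import comb, erfc, sqrt


def pooled_z_test(x1, n1, x2, n2):
    """Two-sided z test of p1 = p2 using the pooled proportion (assumed)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    if se == 0:
        return None, None  # statistic is 0/0, so the test is undefined
    z = (p1 - p2) / se
    return z, erfc(abs(z) / sqrt(2))  # two-sided normal tail area


def fisher_exact_two_sided(x1, n1, x2, n2):
    """Exact two-sided P-value, conditioning on the table margins."""
    k = x1 + x2  # total number of successes
    lo, hi = max(0, k - n2), min(n1, k)
    # Integer weights proportional to the hypergeometric probabilities.
    weights = {x: comb(n1, x) * comb(n2, k - x) for x in range(lo, hi + 1)}
    observed = weights[x1]
    extreme = sum(w for w in weights.values() if w <= observed)
    return Fraction(extreme, sum(weights.values()))


# Balanced design: 1 Head in 1000 tosses vs. 1000 Heads in 1000 tosses.
z, p_z = pooled_z_test(1, 1000, 1000, 1000)
p_fisher = fisher_exact_two_sided(1, 1000, 1000, 1000)
print(f"Z = {z:.2f}, normal-approximation P < 0.0005: {p_z < 0.0005}")
print(f"Fisher exact P < 0.0005: {p_fisher < Fraction(1, 2000)}")
```

The Fisher P-value is kept as an exact `Fraction` because it is far too small to represent as a floating-point number; Minitab's 0.000 is the same quantity rounded to three places.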
Formally, your gene data in Sand and Water call for the same statistical analysis. But for me, the strength of evidence is not the same: Finding a gene is technically harder than seeing whether a coin shows H or T.
I would want to be sure that the gene ID procedure works exactly the same for
wet and relatively dry specimens, and that the sampling methods for specimens in sand and water are equivalent.
Unbalanced design (one vs. 1000): It isn't clear from the table how many Sand specimens there are.
If it's one in Sand and 1000 in Water (as your last sentence suggests), then neither test shows significance: The chi-squared test can't be completed (because the sample size in Sand is too small) and Fisher's exact test gives a P-value of $1.$
Test and CI for Two Proportions
Sample X N Sample p
1 1 1 1.000000
2 1000 1000 1.000000
Difference = p (1) - p (2)
Estimate for difference: 0
95% CI for difference: (*, *)
Test for difference = 0 (vs ≠ 0):
Z = * P-Value = *
* NOTE * The normal approximation may be inaccurate for small samples.
Fisher’s exact test: P-Value = 1.000
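To see why the output above has asterisks and a Fisher P-value of exactly 1, it may help to list the two-by-two tables that are possible once the margins are fixed. This is a small sketch under the assumed counts (one Sand specimen, 1000 Water specimens, all positive); `possible_tables` is a hypothetical helper name, not part of any library.

```python
def possible_tables(n1, n2, k):
    """Success counts in sample 1 consistent with sizes n1, n2 and k total successes."""
    return list(range(max(0, k - n2), min(n1, k) + 1))


# Assumed counts, matching the Minitab output above.
n_sand, n_water = 1, 1000
successes = 1 + 1000

support = possible_tables(n_sand, n_water, successes)
pooled = successes / (n_sand + n_water)
pooled_var = pooled * (1 - pooled)  # variance term used by the Z statistic

print("possible Sand counts:", support)
print("pooled variance term:", pooled_var)
```

Only one table is compatible with the margins, so the observed table has conditional probability 1 and Fisher's test cannot give anything other than 1. The pooled variance term is 0, so the Z statistic would be 0 divided by 0, which is why Minitab prints asterisks.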
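Often, one rejects the null hypothesis that the proportions are equal when the P-value is below 5%. Here the P-value is 1 simply because both sample proportions are exactly 1, so the data give no evidence of a difference. More generally, though, a P-value very near 1 can be a sign
that the model is wrong or that the data were not collected as described.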