
First of all, I have read the answers to this question, but I'm not happy with them: I feel that they miss the point I want to address here.

I'm looking at a chi-squared test for independence of two dichotomous variables. Let's say the categories are $A, B$ for variable 1 and $a, b$ for variable 2.

I interpret the test as telling me how far the proportion of $a$ among $A$ is from the proportion of $a$ among $B$. It could be that $P(a|A)<P(a|B)$ and it could be the opposite, and if I look at the usual $\frac{(O-E)^2}{E}$ test, it's going to reject both directions: the most extreme $5\%$ of tables where $P(a|A)<P(a|B)$ as well as the most extreme $5\%$ of tables where $P(a|A)>P(a|B)$, which together make up the most extreme $5\%$ of all tables.
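
To make the "rejects both directions" point concrete, here is a minimal sketch in Python. The function name `pearson_chi2` and the counts are made up for illustration, no continuity correction is applied, and all marginal totals are assumed to be positive. It computes the usual $\sum\frac{(O-E)^2}{E}$ for a table and for its mirror image, in which the direction of the inequality is reversed.

```python
import math

def pearson_chi2(table):
    """Pearson chi-squared statistic for a 2x2 table [[n_Aa, n_Ab], [n_Ba, n_Bb]].

    Hypothetical helper: no continuity correction, marginals assumed positive.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            stat += (table[i][j] - expected) ** 2 / expected
    return stat

def chi2_pvalue_1df(stat):
    # Upper tail of chi-squared with 1 degree of freedom:
    # P(chi2 > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2))
    return math.erfc(math.sqrt(stat / 2))

# Made-up counts: P(a|A) = 0.6 > P(a|B) = 0.4
table = [[30, 20], [20, 30]]
# Mirror image: P(a|A) = 0.4 < P(a|B) = 0.6
mirror = [[20, 30], [30, 20]]

for t in (table, mirror):
    stat = pearson_chi2(t)
    print(t, stat, chi2_pvalue_1df(stat))
```

Both tables give the same statistic and hence the same p-value, so the test by itself cannot tell which way the inequality goes.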

If you want to test only one of these directions, to me it makes perfect sense to use a $10\%$ level of significance and reject the null hypothesis only if the inequality goes the way you predicted. The argument that the $\chi^2$ distribution is asymmetrical (it has only one tail, which encompasses both extreme situations) looks artificial to me: if you really insist that this matters, you could simply introduce a new statistic called $\pm\chi^2$ that is the same as $\chi^2$, except that you add a minus sign when, say, $P(a|A)<P(a|B)$. Then the curve is symmetrical and does the expected job. What is wrong with this point of view?
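
A minimal sketch of the signed statistic, under some assumptions of mine: I use the signed square root of $\chi^2$ rather than $\pm\chi^2$ itself (a monotone transformation, so the rejection regions are the same), I rely on the large-sample approximation that this root is standard normal under the null, and the function names and counts are hypothetical.

```python
import math

def signed_root_chi2(table):
    """Signed square root of the Pearson chi-squared for [[n_Aa, n_Ab], [n_Ba, n_Bb]].

    Positive when P(a|A) > P(a|B), negative when P(a|A) < P(a|B).
    Hypothetical helper: no continuity correction, marginals assumed positive.
    """
    (n_Aa, n_Ab), (n_Ba, n_Bb) = table
    n_A, n_B = n_Aa + n_Ab, n_Ba + n_Bb
    n_a, n_b = n_Aa + n_Ba, n_Ab + n_Bb
    n = n_A + n_B
    # For a 2x2 table, chi2 = n * (ad - bc)^2 / (product of the four marginals),
    # and the sign of (ad - bc) is the sign of P(a|A) - P(a|B).
    return (n_Aa * n_Bb - n_Ab * n_Ba) * math.sqrt(n / (n_A * n_B * n_a * n_b))

def one_sided_p(z):
    # P(Z > z) for a standard normal Z: tests the prediction P(a|A) > P(a|B)
    return 0.5 * math.erfc(z / math.sqrt(2))

def two_sided_p(z):
    # P(|Z| > |z|), which equals the chi-squared (1 df) p-value for z**2
    return math.erfc(abs(z) / math.sqrt(2))

table = [[30, 20], [20, 30]]   # made-up counts, direction as predicted
mirror = [[20, 30], [30, 20]]  # same counts, opposite direction

for t in (table, mirror):
    z = signed_root_chi2(t)
    print(t, z, one_sided_p(z), two_sided_p(z))
```

When the data go in the predicted direction, the one-sided p-value is exactly half of the $\chi^2$ p-value, so rejecting at the $10\%$ level on $\chi^2$ and checking the direction is the same rule as a one-sided test at $5\%$. When the data go the other way, the one-sided p-value exceeds $0.5$ and the null is never rejected.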

kjetil b halvorsen
Arnaud Mortier
  • You may have intended $2.5\%$ rather than $10\%$, which keeps the critical point in a normal test near two standard deviations, Fisher's rule of thumb for further investigation. I have sympathy for this, as it discourages arbitrary switching between two- and one-tailed tests to get a "more significant" result – Henry Mar 03 '20 at 08:27
  • In some hypothesis tests, a low $\chi^2$ statistic indicates that the observed data are excessively close to the expected values. If you reject for this reason (e.g. Mendel's peas), then you are really rejecting the hypothesis that this was a random sample, rather than concluding that the model was wrong – Henry Mar 03 '20 at 08:28

0 Answers