Applicability of chi-square test if many cells have frequencies less than 5

Question

To test for an association between peer support (independent variable) and work satisfaction (dependent variable), I wish to apply a chi-square test. Peer support is categorized into four groups according to the extent of support: 1 = to a very small extent, 2 = to some extent, 3 = to a great extent, and 4 = to a very great extent. Work satisfaction is categorized into two groups: 0 = not satisfied and 1 = satisfied.

The SPSS output says that 37.5 percent of the cells have frequencies less than 5. My sample size is 101, and I don't want to collapse the independent variable into fewer categories. In this situation, is there any other test that can be applied to test this association?

I'm not entirely sure how it's handled in higher dimensional tables like yours, but in the 2x2 case, the small sample analog to the chi-square is the Fisher Exact Test. I'd heard it's possible to use the FET in arbitrary r x c contingency tables, but that it was computationally intensive. Another option would be to do a permutation test. — Christopher Aden, Sep 04 '12 at 09:23
Given that both categories are ordinal, you could use a test that exploits that. See [Agresti, Analysis of Ordinal Categorical Data](http://www.amazon.com/Analysis-Ordinal-Categorical-Probability-Statistics/dp/0470082895/ref=sr_1_1?ie=UTF8&qid=1346755421&sr=8-1&keywords=agresti+analysis+of+ordinal) for various possibilities. — Peter Flom, Sep 04 '12 at 10:44
There is a form of Fisher's test for $R\times C$ tables that can be used whether or not the chi square test works asymptotically. There are other alternatives as well. See [Categorical Data Analysis by Agresti](http://www.amazon.com/Categorical-Analysis-Wiley-Probability-Statistics/dp/0471360937/ref=sr_1_1?s=books&ie=UTF8&qid=1346754928&sr=1-1&keywords=categorical+data+analysis). — Michael R. Chernick, Sep 04 '12 at 10:36
Why was my answer converted to a comment? It answers the question. — Michael R. Chernick, Sep 04 '12 at 12:35
@Michael Because it is not an answer: it is merely a hint followed by a (vague) pointer to an answer elsewhere. Please see the [SE FAQ about answers](http://meta.stackexchange.com/questions/118582/what-is-an-acceptable-answer). — whuber, Sep 04 '12 at 12:56
@whuber I disagree. The answer to the question is to use Fisher's test for RxC tables which I gave. I referred to Agresti for more details. — Michael R. Chernick, Sep 04 '12 at 14:27
You're welcome to discuss this on meta, @Michael, but not here. If you do open a discussion, I will maintain that "a form of" and "other alternatives" are too vague to be considered answers, as MånsT was gently trying to suggest. Sure, there is a gray area between answer status and comment status. As a moderator and reviewer I constantly am called to determine when would-be answers are really functioning as comments: this test of vagueness is one I attempt to consistently apply. — whuber, Sep 04 '12 at 14:31
@Braj-Stat, one thing to note is that the "requirement" (such as it is) for the chi-squared test is that *expected values* are >5 in all cells, *not* raw counts, although you may still violate that rule of thumb, &/or want to run a different test anyway. — gung - Reinstate Monica, Sep 04 '12 at 15:32
That's a good point, @gung. I recall someone else indicated in a comment (in another thread a few months ago) that chi-squared tests can remain accurate when a few cell populations are as low as $1$ provided they incorporate a continuity correction. (Comments are hard to search so I'm at a loss to find the exact quotation.) I have not personally verified this advice, but it is plausible. — whuber, Sep 04 '12 at 15:35
Oops, @whuber, I may have phrased that confusingly. I didn't mean that you *should* violate that rule of thumb, but rather that the *OP's data* may still violate that rule of thumb (ie EV's may be <5 in addition to raw counts <5). Although, I believe it is true that the requirement is much less strict than people believe, & it's possible the test is not badly harmed. — gung - Reinstate Monica, Sep 04 '12 at 17:02
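
The permutation test suggested in the comments can be sketched as follows. This is only a minimal illustration using the Python standard library: the 4x2 counts are invented (chosen to sum to 101 with three of eight cells below 5), and the function names `chi2_stat` and `permutation_p_value` are hypothetical helpers, not part of any package. The idea is to shuffle the satisfaction labels across subjects, recompute the Pearson statistic each time, and report how often the shuffled statistic is at least as large as the observed one.

```python
import random

# Hypothetical counts, NOT the asker's data.
# Rows = peer-support levels 1-4, columns = (not satisfied, satisfied).
observed = [[3, 5], [6, 20], [4, 37], [2, 24]]

def chi2_stat(table):
    """Pearson chi-square statistic; assumes every row and column total is positive."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n  # expected count under independence
            stat += (o - e) ** 2 / e
    return stat

def permutation_p_value(table, n_perm=2000, seed=0):
    """Monte Carlo permutation p-value for independence of rows and columns."""
    rng = random.Random(seed)
    n_rows, n_cols = len(table), len(table[0])
    # Expand the table into one (row label, column label) pair per subject.
    rows, cols = [], []
    for i, row in enumerate(table):
        for j, count in enumerate(row):
            rows.extend([i] * count)
            cols.extend([j] * count)
    observed_stat = chi2_stat(table)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(cols)  # break any association while keeping both margins fixed
        permuted = [[0] * n_cols for _ in range(n_rows)]
        for i, j in zip(rows, cols):
            permuted[i][j] += 1
        # Small tolerance guards against floating-point ties.
        if chi2_stat(permuted) >= observed_stat - 1e-12:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return (hits + 1) / (n_perm + 1)

p_value = permutation_p_value(observed)
```

Because the margins are held fixed, this approximates the same conditional reference distribution that an exact test on an $R\times C$ table would enumerate, at much lower computational cost.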

RioRaider · Answer 1 · 2012-09-07T15:02:59.030

Conover (1999:202) suggested that the expected values can be "as small as 0.5, as long as most are greater than 1.0, without endangering the validity of the test."

He also cites a "rule of thumb" from Cochran (1952), which suggests that if any expected values are less than 1, or if more than 20% of them are less than 5, the test may perform poorly. However, Conover (1999) provides some evidence that Cochran's "rule of thumb" is overly conservative.
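
To make the two guidelines concrete, here is a small sketch that computes expected counts for a 4x2 table and checks them against both rules as quoted above. The counts are invented for illustration (they sum to 101 but are not the asker's data), and the names `expected_counts`, `cochran_flag`, and `conover_ok` are hypothetical. Reading "most" as "more than half" is my own assumption.

```python
# Hypothetical counts, NOT the asker's data.
# Rows = peer-support levels 1-4, columns = (not satisfied, satisfied).
observed = [[3, 5], [6, 20], [4, 37], [2, 24]]

def expected_counts(table):
    """Expected counts under independence: row total * column total / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

expected = expected_counts(observed)
cells = [e for row in expected for e in row]

frac_below_5 = sum(e < 5 for e in cells) / len(cells)
min_expected = min(cells)

# Cochran's rule of thumb: trouble if any expected count is below 1
# or if more than 20% of expected counts are below 5.
cochran_flag = min_expected < 1 or frac_below_5 > 0.20

# Conover's looser guideline: expected counts as small as 0.5 are acceptable
# as long as most exceed 1.0 ("most" is taken here to mean more than half).
conover_ok = min_expected >= 0.5 and sum(e > 1 for e in cells) > len(cells) / 2
```

For this invented table, Cochran's rule raises a flag while Conover's guideline does not, which is the situation the answer describes. Note that the check is on expected counts, not on the raw cell frequencies.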

References

Cochran, W. G. 1952. The $\chi^2$ test of goodness of fit. Annals of Mathematical Statistics 23:315-345.

Conover, W. J. 1999. Practical nonparametric statistics. Third Edition. John Wiley & Sons, Inc., New York, New York, USA.

abaumann · Answer 2 · 2012-09-05T09:36:05.403

The $\chi^2$-test was originally devised by Pearson as an approximation to the log-likelihood ratio, because log-likelihoods were too computationally demanding at the time.

The G statistic (the likelihood-ratio statistic) is defined as $G = 2\sum_{ij}O_{ij}\ln(O_{ij}/E_{ij})$, where $O_{ij}$ and $E_{ij}$ are the observed and expected counts in cell $(i,j)$. Asymptotically, it follows the same $\chi^2$ distribution as the corresponding Pearson statistic.

(Forgot to mention originally: G is much less sensitive to expected cell counts < 5.)
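
As a rough illustration of the formula, the sketch below computes both the Pearson statistic and G for the same table using only the standard library. The counts are invented (not the asker's data) and `pearson_and_g` is a hypothetical helper name. Cells with an observed count of zero contribute nothing to G, following the usual convention that $0\ln 0 = 0$.

```python
import math

# Hypothetical counts, NOT the asker's data.
# Rows = peer-support levels 1-4, columns = (not satisfied, satisfied).
observed = [[3, 5], [6, 20], [4, 37], [2, 24]]

def pearson_and_g(table):
    """Return (Pearson X^2, G, degrees of freedom) for an r x c table.

    Assumes every row and column total is positive, so all expected counts are > 0.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    x2 = 0.0
    g = 0.0
    for i, row in enumerate(table):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n  # expected count under independence
            x2 += (o - e) ** 2 / e
            if o > 0:  # a zero cell contributes 0 to G
                g += 2.0 * o * math.log(o / e)
    df = (len(table) - 1) * (len(table[0]) - 1)
    return x2, g, df

x2, g, df = pearson_and_g(observed)
```

Both statistics would then be referred to a $\chi^2$ distribution with `df` degrees of freedom. In practice a library routine is preferable; for example, SciPy's `chi2_contingency` accepts `lambda_="log-likelihood"` to compute the G statistic.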
