
When we do a chi-squared test (to test goodness of fit or the dependence of two variables), we assume that the chi-squared statistic follows the chi-squared distribution.

  • Shouldn't we first check if the chi-squared statistic follows the chi-squared distribution in that particular case?
  • If yes, then how do we do that?
  • Or have I got it all mixed up and my question itself is wrong?
kjetil b halvorsen
  • Learn about the chi squared distribution. Hope this helps. https://www.youtube.com/watch?v=VGvVZmZ1g5c – Sahil Chaudhary Aug 28 '15 at 02:28
  • My post at http://stats.stackexchange.com/a/17148 answers these questions by detailing the assumptions that must be checked as well as illustrating what can go wrong when they don't hold. – whuber Aug 28 '15 at 14:17

1 Answer


This is actually pretty straightforward. The chi-squared distribution is a continuous distribution, but a chi-squared test statistic may or may not be able to take every positive real value. For example, the test statistic for a likelihood ratio test can take continuous values, whereas the test statistic from a chi-squared test of independence for a 2x2 contingency table can only take a finite set of discrete values. The former may match the theoretical distribution just fine, but the latter can only ever approximate it. If your sample is large enough, the approximation isn't a problem, and Yates' correction for continuity also helps a lot, so in practice it usually isn't something you need to worry about. To understand this further, it may help to read my answer here: Comparing and contrasting, p-values, significance levels and type I error.
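To make the discreteness point concrete, here is a minimal sketch in Python (standard library only) of one way to check how well the approximation holds in a particular case. It rests on assumptions that are not part of the answer above: the two rows of the 2x2 table are treated as independent binomial samples with fixed row totals and a common success probability `p = 0.3`, and the function names are purely illustrative. It enumerates every possible table, computes the uncorrected Pearson statistic, and reports how many distinct values the statistic can take along with the exact probability of rejecting at a nominal 5% level.

```python
import math

def pearson_2x2(a, b, c, d):
    """Pearson chi-squared statistic (no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    r1, r2, c1, c2 = a + b, c + d, a + c, b + d
    if 0 in (r1, r2, c1, c2):
        return 0.0  # degenerate table: statistic undefined, treated as 0 here (an assumption)
    return n * (a * d - b * c) ** 2 / (r1 * r2 * c1 * c2)

def chi2_sf_1df(x):
    """Upper-tail probability of the chi-squared distribution with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))

def exact_null_distribution(n1, n2, p):
    """Exact distribution of the statistic when row i is Binomial(n_i, p) (assumed model)."""
    dist = {}
    for a in range(n1 + 1):
        for c in range(n2 + 1):
            prob = (math.comb(n1, a) * p**a * (1 - p)**(n1 - a)
                    * math.comb(n2, c) * p**c * (1 - p)**(n2 - c))
            # Round so that equal statistics computed in different ways are merged.
            stat = round(pearson_2x2(a, n1 - a, c, n2 - c), 10)
            dist[stat] = dist.get(stat, 0.0) + prob
    return dist

def actual_size(n1, n2, p, alpha=0.05):
    """Exact rejection probability under the null when using the chi-squared(1) p-value."""
    dist = exact_null_distribution(n1, n2, p)
    return sum(prob for stat, prob in dist.items() if chi2_sf_1df(stat) < alpha)

if __name__ == "__main__":
    for n in (5, 20, 100):
        dist = exact_null_distribution(n, n, 0.3)
        print(f"n per row = {n}: {len(dist)} distinct values, "
              f"actual size = {actual_size(n, n, 0.3):.4f} (nominal 0.05)")
```

If the chi-squared approximation were exact, the reported size would equal 0.05 for every sample size; comparing the printed values against 0.05 for your own row totals gives a rough sense of how much the discreteness matters in that case.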

gung - Reinstate Monica
  • I gave this +1, but a small niggle worries at me: the small-sample distribution of an LRT (even when continuous) may not be very near chi-square. I'd suggest "may match" rather than "will match" as more generally correct. – Glen_b Aug 28 '15 at 02:52