
I just noticed that the non-exact McNemar's test uses the asymptotic $\chi^{2}$ distribution. But since the exact test (for the 2×2 table) relies on the binomial distribution, why is it not common to suggest the normal approximation to the binomial distribution instead?

kjetil b halvorsen
Tal Galili

2 Answers


A close-to-intuitive answer:

Take a closer look at the formula for the McNemar test, given the table

      pos | neg
----|-----|-----
pos |  a  |  b
----|-----|-----
neg |  c  |  d

The McNemar statistic M is calculated as:

$$ M = {(b-c)^2 \over b+c} $$
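As a quick illustration, the statistic depends only on the two discordant cells, b and c. A minimal Python sketch; the function name `mcnemar_statistic` and the example counts are made up here for illustration:

```python
def mcnemar_statistic(b: int, c: int) -> float:
    """McNemar statistic M = (b - c)^2 / (b + c), using the discordant cells only."""
    if b + c == 0:
        raise ValueError("M is undefined when there are no discordant pairs")
    return (b - c) ** 2 / (b + c)


# Hypothetical counts: b = 15, c = 5 (a and d do not enter the formula)
M = mcnemar_statistic(15, 5)
print(M)  # (15 - 5)^2 / (15 + 5) = 100 / 20 = 5.0
```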

A $\chi^2$ distribution with k degrees of freedom is, by definition, the distribution of the sum of squares of k independent standard normal variables. If the 4 numbers are large enough, b and c, and thus b-c and b+c, can be approximated by normal distributions. Given the formula for M, it is then easily seen that, for large enough values, M approximately follows a $\chi^2$ distribution with 1 degree of freedom.


EDIT: As onestop rightly indicated, the normal approximation is in fact completely equivalent. That follows directly from the argument above, which approximates b-c by a normal distribution.

The exact binomial version is also equivalent to the sign test, in the sense that it compares b with a $Binom(b+c, 0.5)$ distribution. For large b+c, the distribution of b under the null hypothesis can be approximated by $N(0.5\times(b+c),\ 0.5^2\times(b+c))$.

Or, equivalently:

$$\frac{b-(\frac{b+c}{2})}{\frac{\sqrt{b+c}}{2}}\sim N(0,1)$$

which simplifies to

$$ \frac{b-c}{\sqrt{b+c}}\sim N(0,1)$$

or, after squaring both sides, to $M \sim \chi^2_1$.

Hence, the normal approximation is used. It is the same as the $\chi^2$ approximation.
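The equivalence can be checked numerically. The sketch below uses only the Python standard library and relies on the fact that both tail probabilities can be written with the complementary error function; the function names and the counts b = 15, c = 5 are hypothetical, chosen for illustration:

```python
import math


def normal_two_sided_p(b: int, c: int) -> float:
    """Two-sided p-value from z = (b - c) / sqrt(b + c) against N(0, 1)."""
    z = (b - c) / math.sqrt(b + c)
    # P(|Z| > |z|) for a standard normal Z
    return math.erfc(abs(z) / math.sqrt(2))


def chi2_1df_p(b: int, c: int) -> float:
    """Upper-tail p-value from M = (b - c)^2 / (b + c) against chi-square(1)."""
    m = (b - c) ** 2 / (b + c)
    # P(X > m) for X ~ chi-square with 1 df equals erfc(sqrt(m / 2))
    return math.erfc(math.sqrt(m / 2))


p_normal = normal_two_sided_p(15, 5)
p_chi2 = chi2_1df_p(15, 5)
print(p_normal, p_chi2)
# The two agree up to floating-point rounding
print(math.isclose(p_normal, p_chi2, rel_tol=1e-9))
```

Swapping b and c flips the sign of z but leaves both p-values unchanged, which is the sense in which the $\chi^2$ version is two-sided.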

Joris Meys
  • That's right. The connection can perhaps be seen more clearly by considering Sqrt(M) = (b-c) / Sqrt(b+c). Approximating the variance of b as b and the variance of c as c (as is usual with counted data), we see that Sqrt(M) looks like an approximately normal variate (b-c) divided by its standard deviation: in other words, it looks like a *standard* normal variate. In fact, we could conduct an equivalent test by referring Sqrt(M) to a table of the standard normal distribution. Squaring it effectively makes the test symmetric two-tailed. Obviously this breaks down if either b or c is small. – whuber Nov 23 '10 at 17:11
  • Thank you for the intuitive answer, Joris. Still, why is it more common to use this approximation rather than the normal approximation to the exact binomial test of McNemar? – Tal Galili Nov 23 '10 at 18:08
  • @Tal: It's the same. See onestop's answer and my edit. – Joris Meys Nov 23 '10 at 21:10
  • Actually - last question. So if both are identical (and I think you might also need an "absolute value" around the b-c), then why do people go to the chi distribution instead of staying with the normal one? Where's the advantage? – Tal Galili Nov 24 '10 at 20:14
  • @Tal: You don't want that absolute value around it. N(0,1) can take negative values as well. Why take the chi2? A matter of taste I guess? chi2 is one-tailed, so tests two-sided by default. – Joris Meys Nov 24 '10 at 22:03
  • Dear Joris, you wrote "chi2 is one-tailed, so tests two-sided by default". How is that? I thought that once you get to the P value, you can move back and forth between two tails and one tail by simply multiplying or dividing the value by 2. – Tal Galili Nov 25 '10 at 06:25
  • @Tal: You know R. Plot the chi2 with one degree of freedom and you'll see. – Joris Meys Nov 25 '10 at 09:02
  • Thanks Joris - I just got it (once you square it - it gives you the two sided test) – Tal Galili Nov 25 '10 at 16:30

Won't the two approaches come to the same thing? The relevant chi-square distribution has one degree of freedom so is simply the distribution of the square of a random variable with a standard normal distribution. I'd have to go through the algebra to check, which I haven't got time to do right now, but I'd be surprised if you don't end up with exactly the same answer both ways.
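The algebra is short. A sketch, writing $Z$ for a standard normal variable, $\Phi$ for its distribution function, and $m$ for the observed value of the statistic (these symbols are introduced here for illustration):

$$ P(\chi^2_1 > m) = P(Z^2 > m) = P(|Z| > \sqrt{m}) = 2\left(1 - \Phi(\sqrt{m})\right) $$

So the $\chi^2$ p-value for $M = m$ is the two-sided normal p-value for $\sqrt{m} = |b-c| / \sqrt{b+c}$.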

onestop
  • see my answer for further elaboration – Joris Meys Nov 23 '10 at 20:40
  • Hi onestop - Since both are asymptotic, for smaller N's they might yield somewhat different results. In such a case, I wonder whether the choice of going with chi-square is because it is better than the normal approximation, or because of historical reasons (or maybe, as you suggested, they always yield identical results) – Tal Galili Nov 23 '10 at 20:51
  • @Tal: for smaller N, neither holds. And as shown in my edit, they are exactly the same. – Joris Meys Nov 23 '10 at 21:12
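
To get a feel for the small-N point in the comments above, the exact binomial p-value can be set next to the asymptotic one. This is a rough sketch using only the Python standard library; the function names and counts are hypothetical, and doubling the smaller tail is just one common convention for a two-sided exact p-value (an assumption here, not something stated in the thread):

```python
import math


def exact_mcnemar_p(b: int, c: int) -> float:
    """Two-sided exact p-value: compare b with Binomial(b + c, 0.5)."""
    n = b + c
    k = min(b, c)
    # Lower tail P(X <= k) under Binomial(n, 0.5)
    tail = sum(math.comb(n, i) for i in range(k + 1)) / 2 ** n
    # Assumed convention: double the smaller tail and cap at 1
    return min(1.0, 2 * tail)


def asymptotic_mcnemar_p(b: int, c: int) -> float:
    """Chi-square(1) p-value, identical to the two-sided normal p-value."""
    m = (b - c) ** 2 / (b + c)
    return math.erfc(math.sqrt(m / 2))


# Hypothetical small and larger discordant counts
for b, c in [(6, 1), (60, 35)]:
    print(b, c, exact_mcnemar_p(b, c), asymptotic_mcnemar_p(b, c))
```

With b = 6 and c = 1 the exact p-value is 0.125, while the asymptotic value is noticeably smaller; with larger counts the two move closer together.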