
A published article (pdf) contains these 2 sentences:

Moreover, misreporting may be caused by the application of incorrect rules or by a lack of knowledge of the statistical test. For example, the total df in an ANOVA may be taken to be the error df in the reporting of an $F$ test, or the researcher may divide the reported p value of a $\chi^2$ or $F$ test by two, in order to obtain a one-sided $p$ value, whereas the $p$ value of a $\chi^2$ or $F$ test is already a one-sided test.

Why might they have said that? The chi-squared test is a two-sided test. (I have asked one of the authors, but gotten no response.)

Am I overlooking something?

Joel W.
  • see http://stats.stackexchange.com/questions/171074/chi-square-test-why-is-the-chi-squared-test-a-one-tailed-test/171084#171084 –  Sep 13 '15 at 06:59
  • Look at exercise 4.14 of Davidson & MacKinnon's 'Econometric Theory and Methods' (2004 edition) for an (exceptional) example of when the chi-squared is used for a two-tailed test. Edit: great explanation here: http://www.itl.nist.gov/div898/handbook/eda/section3/eda358.htm – Max Apr 29 '13 at 06:32
  • There's at least one case where it makes sense to talk about a one-sided chi-squared: when you have two dichotomous variables. I give more details [here](https://stats.stackexchange.com/questions/447625/a-specific-example-of-two-sided-chi-squared-test). – Arnaud Mortier Feb 12 '20 at 10:47

7 Answers


The chi-squared test is essentially always a one-sided test. Here is a loose way to think about it: the chi-squared test is basically a 'goodness of fit' test. Sometimes it is explicitly referred to as such, but even when it's not, it is still in essence a goodness-of-fit test. For example, the chi-squared test of independence on a 2 x 2 frequency table is (sort of) a test of the goodness of fit of the first row (column) to the distribution specified by the second row (column), and vice versa, simultaneously. Thus, when the realized chi-squared value is way out on the right tail of its distribution, it indicates a poor fit, and if it is far enough, relative to some pre-specified threshold, we might conclude that it is so poor that we don't believe the data are from that reference distribution.

If we were to use the chi-squared test as a two-sided test, we would also be worried if the statistic were too far into the left side of the chi-squared distribution. This would mean that we are worried the fit might be too good. This is simply not something we are typically worried about. (As a historical side-note, this is related to the controversy of whether Mendel fudged his data. The idea was that his data were too good to be true. See here for more info if you're curious.)
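To make the one-sidedness concrete, here is a minimal sketch (my illustration, not part of the original answer; the counts are invented) using scipy. The reported p-value is the right-tail area only, and a Mendel-style 'too good to be true' check would have to look at the left tail explicitly:

```python
# Minimal sketch (invented counts): the chi-squared goodness-of-fit p-value
# is the right-tail area of the chi-squared distribution only.
from scipy import stats

observed = [18, 55, 27]        # hypothetical observed counts
expected = [25.0, 50.0, 25.0]  # counts expected under the null

chi2_stat, p_right = stats.chisquare(observed, f_exp=expected)
print(chi2_stat, p_right)      # p_right = P(chi2 >= chi2_stat): right tail

# A Mendel-style "fit too good to be true" check would instead look at
# the LEFT tail -- something the standard test never does:
df = len(observed) - 1
p_left = stats.chi2.cdf(chi2_stat, df)
print(p_left)                  # a small p_left would flag a suspiciously good fit
```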

gung - Reinstate Monica
  • +1 for mentioning the two-sided use with Mendel's pea experiments: it's memorable and gets to the heart of the question. – whuber Feb 06 '12 at 17:01
  • I see what you are saying about goodness of fit, Jon, but consider this. Let's say we are comparing survival rates for groups A and B. The survival rate for group A could be higher than the survival rate group B or it could be lower. Why is that not a 2 tailed test? – Joel W. Feb 06 '12 at 17:02
  • +1 for a good question and an excellent answer. @Joel W: I can strongly recommend Khan Academy's video on the [$\chi^2$ test](http://www.khanacademy.org/video/chi-square-distribution-introduction?topic=statistics) – Max Gordon Feb 06 '12 at 17:15
  • @JoelW., A could be higher than B, or B > A, but in both cases, when using the chi-squared test, you are asking if the two distributions are good fits for each other; in neither case are you worried that the numbers might match too identically. – gung - Reinstate Monica Feb 06 '12 at 17:23
  • @JoelW to expand on gung's comment on your comment, in both cases the result is a test statistic that is too *high* for the distribution under a null hypothesis, which is why we only check the value on the right tail. – Peter Ellis Feb 06 '12 at 18:48
  • Yes, Peter, in that sense we see the Chi-Sq as one-sided. But, we can be interested in both ways the results can deviate from the null hypothesis (A>B and B>A). In that sense the test is two-sided. If we were only interested in outcomes in one direction (e.g., is the new treatment, A, more effective than the old one, B) would it not be appropriate to modify the alpha level to reflect this directional alternative hypothesis? – Joel W. Feb 06 '12 at 19:07
  • @JoelW., 2 points here. 1) You may certainly be interested in both orderings of the groups, but this is not the same thing as being interested in whether the chi-squared statistic falls in either tail of its distribution. Whether A>>B (ie much bigger), or B>>A, the chi-squared statistic would fall far into the *right* tail of the dist. (2) You may adjust the alpha level for your studies as you like (provided this is done before gathering data & the process/reasoning is clearly stated), but that is very different from "dividing the p value by 2", which would not be valid. – gung - Reinstate Monica Feb 06 '12 at 19:29
  • To illustrate: saying, e.g., 'gathering a lot of data would be difficult, so we set $\alpha=.10$; our p=.06' would be perfectly valid. But finding p=.06, dividing it by 2, and saying 'we set $\alpha=.05$; our p=.03' would not be valid. In this way, both you and the authors are correct. – gung - Reinstate Monica Feb 06 '12 at 19:34
  • My summary of this is that the $\chi^2$ is a two-sided test for which we are usually interested in only one of the tails of the distribution, indicating more disagreement, rather than less disagreement than one expects by chance. – Frank Harrell Feb 06 '12 at 21:50
  • Supporting the 2-tailed view: "The two-tail probability beyond $\pm z$ for the standard normal distribution equals the right-tail probability above $z^2$ for the chi-squared distribution with df=1. For example, the two-tailed standard normal probability of .05 that falls below $-1.96$ and above $1.96$ equals the right-tail chi-squared probability above $(1.96)^2 = 3.84$ when df=1." Agresti, 2007 (2nd ed.), page 11 – Joel W. Feb 07 '12 at 02:30
  • That's right. Squaring a z-score yields a chi-squared variate. For example, a z of 2 (or, -2!) when squared equals 4, the corresponding chi-squared value. The two-tailed p-value associated with a z-score of 2 is .04550026; and the one-tailed p-value associated with a chi-squared value of 4 (df=1) is .04550026. A two-tailed z test corresponds to a one-tailed chi-squared test. Looking at the left tail of the chi-squared distribution would correspond to looking for z-scores that are closer to z=0 than you might expect by chance. (A numeric check of this correspondence appears after this comment thread.) – gung - Reinstate Monica Feb 07 '12 at 02:52
  • The key phrase in that quote is "equals the *right-tail* probability... for the chi-squared distribution". – gung - Reinstate Monica Feb 07 '12 at 03:01
  • If a non-directional (aka 1 tailed) chi-sq is equivalent to a TWO tailed z test, and you want to use the chi-sq in lieu of a ONE tailed z test, it would seem that you should change the alpha level of the chi-sq test so that its critical value reflects the square of a ONE tailed z test. So, if the critical value for a 1 tailed z test is 1.65, then the corresponding critical value for chi sq is the square of that, or 2.72 (not 3.84). So, the authors' statement that it is an error to adjust a 3.84 chi-sq seems incorrect. Or am I missing something? – Joel W. Feb 07 '12 at 16:30
  • You can set alpha wherever you like, so long as it's done a-priori & clearly explained when results are reported. Setting the critical value for chi-square to some specific # is the same as setting alpha--done beforehand & stated explicitly it's perfectly valid. Thus, one could say "we set alpha at .10, our critical chi-square is 2.7, our obtained p is .06", & that's OK, but one could not say "we set alpha at .05, our critical chi-square is 3.8, our obtained p is .03". Doubling the alpha is fine, dividing p by 2 is not. As I said above, in this sense, both you & the authors are right. – gung - Reinstate Monica Feb 07 '12 at 17:52
  • Hmm. If I understand what you are saying, you can do a z-test (say, comparing proportions) and report a one-tailed p value, but you cannot do the mathematically identical chi-square and report that same one-tailed p value. This does not make sense to me. – Joel W. Feb 10 '12 at 00:28
  • I'm sorry this is causing so much trouble; some topics in statistics just aren't very intuitive. Nonetheless, we've been going around and around on this for some time, and there's going to be a limit on how much I can clarify this for you in comments. You may want to see if you can work with a professional statistician, or take some formal classes, to get further assistance. The Khan Academy videos recommended above might be a good place to start. Good luck. – gung - Reinstate Monica Feb 10 '12 at 04:59
  • Let's say an engineer told me that by one calculation a bridge could be determined to be capable of supporting X tons and by a mathematically equivalent calculation the bridge could be determined to be capable of supporting 2X tons. Let's say the engineer further told me both answers are correct, it just depends on the approach you took. I would find that puzzling, too. Mathematically equivalent approaches logically should not result in different conclusions. – Joel W. Feb 13 '12 at 17:06
  • @gung I just came across this great post of yours, which made me realize that I don't understand the concept of goodness-of-fit as compared to other statistical tests. I don't see this addressed straight on in the site, and the Wikipedia entry is not that great (more a list). Can you suggest if and how it would be a good way to ask this? "Intuition..."? – Antoni Parellada Sep 12 '15 at 17:38
  • @AntoniParellada, it's hard to say without knowing what your question is. You might just ask & the question can be refined with comments if necessary. – gung - Reinstate Monica Sep 12 '15 at 17:45
  • what about in this case http://stats.stackexchange.com/questions/223560/how-to-define-a-rejection-region-when-theres-no-ump ? – An old man in the sea. Jul 16 '16 at 09:44
  • @Anoldmaninthesea., you can use a 2-tailed test in that case, note RayKoopman's answer below. – gung - Reinstate Monica Jul 16 '16 at 12:33
  • gung, you're right. However, in both, I don't see a worry that the fit might be too good, which is what would justify the two-tailed test. Then why the two-tailed test? – An old man in the sea. Jul 16 '16 at 14:27
  • @Anoldmaninthesea., if you're testing a variance against a null value there is no fit that's 'too good'. The observed variance could be *too low* to have come from the reference value, or *too high* to have done so, but this is different from a fit that's *too poor* or *too good*. – gung - Reinstate Monica Jul 16 '16 at 14:45
  • gung, but in that case why not choose a one-tailed test with the same size? It seems that there's no consideration w.r.t. power. – An old man in the sea. Jul 16 '16 at 15:09
  • @Anoldmaninthesea., if you want to test if the observed variance is *less than* some value, you should use a 1-tailed test. If you want to test if the observed variance is *not equal to* some value, you need to do a 2-tailed test. – gung - Reinstate Monica Jul 16 '16 at 15:14
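A quick numeric check of the $z^2 \leftrightarrow \chi^2_1$ correspondence discussed in this thread (my sketch; the $z$ values are just the ones mentioned above):

```python
# Numeric check: both tails of z collapse into the right tail of chi^2(1).
from scipy import stats

for z in (1.96, 2.0):
    p_two_tailed_z = 2 * stats.norm.sf(z)      # area beyond -z and +z
    p_right_chi2 = stats.chi2.sf(z**2, df=1)   # right-tail area above z^2
    print(z, z**2, p_two_tailed_z, p_right_chi2)  # the two p-values match
```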

Is chi-squared always a one-sided test?

That really depends on two things:

  1. what hypothesis is being tested. If you're testing the variance of normal data against a specified value, it's quite possible to be dealing with the upper or lower tail of the chi-square distribution (one-tailed), or both tails. We have to remember that $\sum\frac{(O-E)^2}{E}$-type statistics are not the only chi-square tests in town!

  2. whether people are talking about the alternative hypothesis being one- or two-sided (because some people use 'two-tailed' to refer to a two-sided alternative, irrespective of what happens with the sampling distribution of the statistic). This can sometimes be confusing. For example, if we're looking at a two-sample proportions test, someone might write in the null that the two proportions are equal and in the alternative that $\pi_1 \neq \pi_2$, and then speak of it as 'two-tailed', but test it using a chi-square rather than a z-test, and so only look at the upper tail of the distribution of the test statistic. So it's two-tailed in terms of the distribution of the difference in sample proportions, but one-tailed in terms of the distribution of the chi-square statistic obtained from that -- in much the same way that if you make your t-test statistic $|T|$, you're only looking at one tail of the distribution of $|T|$. (A numeric sketch of this proportions example appears after this answer.)

Which is to say, we have to be very careful about what we mean to cover by the use of 'chi-square test' and precise about what we mean when we say 'one-tailed' vs 'two-tailed'.

In some circumstances (two I mentioned; there may be more), it may make perfect sense to call it two-tailed, or it may be reasonable to call it two-tailed if you accept some looseness of the use of terminology.

It may be a reasonable statement to say it's only ever one-tailed if you restrict discussion to particular kinds of chi-square tests.
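Here is a small numeric sketch of the two-sample proportions case in point 2 (my illustration, with invented counts): the alternative is two-sided, yet only the upper tail of the chi-square statistic is examined, and the p-value agrees with the two-tailed z test:

```python
# Invented 2x2 table: two-SIDED alternative, ONE tail of the chi-square.
import numpy as np
from scipy import stats

table = np.array([[30, 70],   # group 1: successes, failures
                  [45, 55]])  # group 2: successes, failures

# Pearson chi-square (no continuity correction), right tail only:
chi2_stat, p_chi2, df, _ = stats.chi2_contingency(table, correction=False)

# Two-tailed z test for the difference in proportions, pooled SE:
n1, n2 = table.sum(axis=1)
p1, p2 = table[0, 0] / n1, table[1, 0] / n2
p_pool = table[:, 0].sum() / (n1 + n2)
z = (p1 - p2) / np.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
p_z = 2 * stats.norm.sf(abs(z))

# z**2 equals chi2_stat (up to floating point), and the p-values coincide:
print(p_chi2, p_z)
```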

Glen_b
  • what about this one? http://stats.stackexchange.com/questions/223560/how-to-define-a-rejection-region-when-theres-no-ump – An old man in the sea. Jul 16 '16 at 09:45
  • Thank you very much for mentioning the variance test. That is actually a quite interesting use of the test, and also the reason why I ended up on this page ^^ – Tobbey Sep 09 '19 at 14:04

The chi-square test with statistic $(n-1)s^2/\sigma^2$ of the hypothesis that the variance is $\sigma^2$ can be either one- or two-tailed, in exactly the same sense that the t-test with statistic $(\bar{x}-\mu)\sqrt{n}/s$ of the hypothesis that the mean is $\mu$ can be either one- or two-tailed.
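As a sketch of how that plays out (my example; the data and the null value $\sigma_0^2$ are invented):

```python
# Chi-square variance test: either tail, or both, depending on H1.
import numpy as np
from scipy import stats

x = np.array([4.8, 5.1, 5.3, 4.7, 5.0, 5.4, 4.9, 5.2])  # invented sample
sigma0_sq = 0.10                            # hypothesized variance under H0
n = len(x)
stat = (n - 1) * x.var(ddof=1) / sigma0_sq  # ~ chi^2(n-1) under H0

p_upper = stats.chi2.sf(stat, n - 1)   # H1: variance >  sigma0^2
p_lower = stats.chi2.cdf(stat, n - 1)  # H1: variance <  sigma0^2
p_two = 2 * min(p_upper, p_lower)      # H1: variance != sigma0^2
print(stat, p_upper, p_lower, p_two)
```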

Ray Koopman

I also had some trouble coming to grips with this question, but after some experimentation it seemed that my problem was simply in how the tests are named.

In SPSS, as an example, a chi-squared test can be added to a 2x2 table. There are two columns of p-values, one set for the "Pearson Chi-Square", "Continuity Correction", etc., and another pair of columns for Fisher's exact test, where there is one column for a 2-sided test and another for a 1-sided test.

I first thought the 1- and 2-sided labels denoted a 1- or 2-sided version of the chi-squared test, which seemed odd. It turned out, however, that they denote the underlying formulation of the alternative hypothesis in the test of a difference between proportions, i.e. the z-test. So the often reasonable 2-sided test of proportions is achieved in SPSS with the chi-squared test, where the chi-squared statistic is compared with a value in the (1-sided) upper tail of the distribution. I guess this is what other responses to the original question have already pointed out, but it took me some time to realize just that.

By the way, the same kind of formulation is used in openepi.com and possibly other systems as well.
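For reference, a rough analogue of that SPSS output in scipy (my sketch; the table is invented). Fisher's exact test takes an explicit one- or two-sided alternative, while the chi-square p-value is a right-tail area throughout:

```python
# Invented 2x2 table: Fisher's exact test with 1- and 2-sided alternatives,
# alongside the (right-tail) chi-square p-value.
import numpy as np
from scipy import stats

table = np.array([[12, 5],
                  [6, 14]])

_, p_fisher_2sided = stats.fisher_exact(table, alternative='two-sided')
_, p_fisher_1sided = stats.fisher_exact(table, alternative='greater')
chi2_stat, p_chi2, _, _ = stats.chi2_contingency(table)  # Yates-corrected

print(p_fisher_2sided, p_fisher_1sided, p_chi2)
```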

Robert L
  • see http://stats.stackexchange.com/questions/171074/chi-square-test-why-is-the-chi-squared-test-a-one-tailed-test/171084#171084 –  Sep 13 '15 at 06:56

@gung's answer is correct and is the way discussion of $\chi^2$ should be read. However, confusion may arise from another reading:

It would be easy to interpret a $\chi^2$ as 'two-sided' in the sense that the test statistic is typically composed of a sum of squared differences from both sides of an original distribution.

This reading, however, confuses how the test statistic was generated with which tail of its distribution is being looked at.

conjectures
  • Could you elaborate on what a "side of an original distribution" would be? It's not even evident what that "original distribution" refers to nor how it is related to the chi-squared statistic as computed from data. – whuber Jun 15 '15 at 16:29
  • For example, a sum of $n$ independent normals squared is $\chi^2$. The normals are the 'original' distribution. The $\chi^2$ stat incorporates information from both tails of the underlying normal distribution. – conjectures Jun 15 '15 at 16:32
  • OK, but I still cannot figure out what you are contrasting that with. Could you provide an example of a non-two-sided test statistic that could be used in ANOVA and show how it is connected with the tails of some distribution? – whuber Jun 15 '15 at 16:39
  • I'm not contrasting it with anything. I'm pointing out a reason why people might get confused about the one-sided/two-sided jargon in the context of $\chi^2$. It's straightforward for experts to see that the $\chi^2$ test itself is usually a one-sided test on the calculated stat. Others may have some data and be thinking about deviations from the mean in both directions, which often get rolled up into a $\chi^2$ stat. They will have heard things along the lines of 'thinking of deviations from the mean in both directions=two-sided test'. Hence a misunderstanding. – conjectures Jun 15 '15 at 16:52
  • I'm asking for a contrast only to help understand what you are trying to describe. I haven't been able to determine what that is yet. – whuber Jun 15 '15 at 17:08
  • The misunderstanding has nothing in particular to do with ANOVA, but everything to do with why someone might have the idea that, 'The chi-squared test is a two-sided test.' To reiterate because the OP may have in mind that the $\chi^2$ stat sums deviations from the mean in both directions. – conjectures Jun 15 '15 at 17:14
  • see http://stats.stackexchange.com/questions/171074/chi-square-test-why-is-the-chi-squared-test-a-one-tailed-test/171084#171084 –  Sep 13 '15 at 06:57

The $\chi^2$ test of a variance can be one- or two-sided: the test statistic is $(n-1)\frac{s^2}{\sigma_0^2}$, and the null hypothesis is that the population standard deviation $\sigma$ equals a reference value $\sigma_0$. The alternative hypothesis could be: (a) $\sigma > \sigma_0$, (b) $\sigma < \sigma_0$, (c) $\sigma \neq \sigma_0$. The p-value calculation must take into account the asymmetry of the distribution.
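Spelled out (a standard formulation, added here for completeness rather than taken from the answer), with $x$ the observed value of the statistic:

$$
x = (n-1)\frac{s^2}{\sigma_0^2}, \qquad
p_{(a)} = P\left(\chi^2_{n-1} \ge x\right), \quad
p_{(b)} = P\left(\chi^2_{n-1} \le x\right), \quad
p_{(c)} = 2\min\left(p_{(a)},\, p_{(b)}\right).
$$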

shahuss
  • Welcome to CV! I think [Ray Koopman's answer](http://stats.stackexchange.com/a/69729/22228) already covers this point. – Silverfish Sep 12 '15 at 13:56

The $\chi^2$ and F tests are one-sided tests because we never have negative values of $\chi^2$ and F. For $\chi^2$, the sum of the squared differences between observed and expected counts is divided by the expected counts, so chi-square is always a positive number, or close to zero on the right side when there is no difference. Thus, this test is always a right-sided, one-sided test. The explanation for the F test is similar.

For the F test, we compare the between-group variance to the within-group variance (the mean square between, $\frac{SS_b}{df_b}$, to the mean square error, $\frac{SS_w}{df_w}$). If the between and within mean sums of squares are equal we get an F value of 1.

Since it is essentially a ratio of sums of squares, the value never becomes a negative number. Thus, we don't have a left-sided test, and the F test is always a right-sided, one-sided test. Check the figures of the $\chi^2$ and F distributions: they are always positive. For both tests, you are looking at whether the calculated statistic lies to the right of the critical value.

[Figure: the $\chi^2$ and F distributions]

Daniel
  • A test statistic doesn't need to take negative values for us to consider both tails. Consider an F test for the ratio of two variances, for example. – Glen_b Mar 03 '17 at 06:25
  • The F test is a one-sided test, Glen_b. – Daniel Mar 03 '17 at 06:49
  • The F test for equality of variances, which has a statistic that's the ratio of the two variance estimates, is NOT one-sided; there's an approximation to it which places the larger of the two sample variances on the numerator, but it's only really right if the df are the same. But if you don't like that, there are any number of other examples. The statistic for the rank sum test cannot be negative, but the test is two-tailed. I can supply other examples if needed. (A sketch of such a two-tailed F test appears after this thread.) – Glen_b Mar 03 '17 at 07:19
  • @Ferdi Unfortunately there's something clearly wrong with the example there -- it says it's two-sided but then implies it only rejects for large values of the statistic. If $\sigma_1^2$ was less than $\sigma_2^2$ we'd be almost never observing a large value for the ratio, so the statistic would only tend to reject when $\sigma_1^2>\sigma_2^2$ making it one-sided. – Glen_b Mar 03 '17 at 11:27
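To illustrate Glen_b's point with a sketch (my example; the data are invented): the F statistic for equality of two variances is always positive, yet the test is two-tailed, because either a very small or a very large ratio is evidence against $H_0$:

```python
# Two-tailed F test for equality of variances: the statistic can't be
# negative, but BOTH tails of its distribution are used.
import numpy as np
from scipy import stats

x = np.array([3.1, 2.8, 3.5, 3.0, 2.9, 3.3])  # invented samples
y = np.array([2.2, 3.9, 1.8, 4.1, 2.6, 3.7])

f_stat = x.var(ddof=1) / y.var(ddof=1)
df1, df2 = len(x) - 1, len(y) - 1

# Double the smaller tail area: rejects for very small OR very large ratios.
p_two_tailed = 2 * min(stats.f.sf(f_stat, df1, df2),
                       stats.f.cdf(f_stat, df1, df2))
print(f_stat, p_two_tailed)
```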