
In the textbooks I have access to that discuss hypothesis testing for correlation, I have only seen examples where the null-hypothesis was $\rho=0$ and the alternative hypothesis was $\rho\ne 0$. My question is about using a one-sided alternative hypothesis, $\rho>0$. Is this meaningful?

This question has been asked before, but it has not been answered. A comment on the linked question said that the null-hypothesis should be $\rho\le 0$ if we want a one-sided alternative hypothesis, but I have a problem with this comment. As I understand it, the t-distribution used for testing the correlation coefficient is only valid when $\rho=0$, so we have no choice but to use this as the null-hypothesis.

So, to summarize: can we test $H_0:\rho=0$ against $H_1:\rho>0$ using the statistic $R\sqrt{\dfrac{n-2}{1-R^2}}$ and the t-distribution with $n-2$ degrees of freedom?

Ferenc Beleznay
  • I don't agree with the premise of this question. "Correlation" is a symmetric measure of association, at least in terms of a Pearson or Spearman correlation -- the most common uses of the term. – Mike Hunter Jul 27 '16 at 12:07
  • Can you please explain this comment a bit? Correlation is indeed a symmetric measure of association, but it can be positive or negative. – Ferenc Beleznay Jul 27 '16 at 12:16
  • If we agree that it is a symmetric measure, then directionality does not apply, *ipso facto*. – Mike Hunter Jul 27 '16 at 12:28
  • See additional comments (posted just now) under the linked question. – amoeba Jul 27 '16 at 12:38
  • @DJohnson: I take symmetric to mean that the correlation of $x$ with $y$ equals the correlation of $y$ with $x$. Nonetheless, it's directional in the sense that if $y$ increases as $x$ increases the correlation is *positive*; if $y$ decreases as $x$ increases the correlation is *negative*. That is the sense relevant to this question. – Scortchi - Reinstate Monica Jul 27 '16 at 12:46
  • @scortchi Right you are. Maybe the more practical question is why anyone would want to use a correlation coefficient to test this hypothesis when a regression-based approach is direct and intuitive. The confusion manifest in the additional comments wrt the *interpretation* of the results is evidence for this concern. – Mike Hunter Jul 27 '16 at 12:52
  • See [Justification of one-tailed hypothesis testing](http://stats.stackexchange.com/q/7853/17230) for how to think about the distribution of the test statistic under a null hypothesis that isn't of the simple form $\theta=0$ (or another exactly specified value). – Scortchi - Reinstate Monica Jul 27 '16 at 13:01
  • @DJohnson: I am a high school teacher, trying to understand what a proper way of teaching statistics to my students would be. These kinds of questions (meaningful or not) appear on exams. As for your comment on the confusion in my other comment, I understand you have a concern, but I still have not received an answer. – Ferenc Beleznay Jul 27 '16 at 13:08
  • I laud your zeal in running down answers to every possible question nuance your students may encounter. However, if the stats texts you have discuss only one type of hypothesis test, does that tell you something about the kinds of questions your students are likely to get wrt correlation? In this instance, is it possible that you are being overly zealous? Moreover, if you aren't getting answers to a very clear question, what does that suggest? To me it suggests that 1) this specific question isn't likely to appear on a test, and 2) if no one can explain it, perhaps it doesn't make sense. – Mike Hunter Jul 27 '16 at 13:28
  • @DJohnson: Unfortunately I don't know how to add an image, so I will type in the question that did appear on a high school final exam. Several thousand students needed to answer this, and as teachers, we need to explain the answer to them: – Ferenc Beleznay Jul 27 '16 at 13:54
  • Ah! That is surprising. If answers aren't forthcoming from this site and given that this question is now in the public domain, I wonder how the test developers would respond. They should be free to provide answers. Have you tried reaching out to that group? – Mike Hunter Jul 27 '16 at 13:58
  • @DJohnson: Sorry, I pressed enter and I cannot edit my previous post. So here is the question: A company claims that travelling distance to work is independent of salary. To test this, 20 employees are asked about salary and travel distance. For this sample, r=-0.35 was found. Perform a one-tailed test at the 5% significance level to test whether the travel distance and salary are independent. – Ferenc Beleznay Jul 27 '16 at 14:02
  • Seems like a perfectly good question - apart from the fact that the direction of the one-tailed test isn't specified! As discussed in [Interpreting one- and two-tailed tests](http://stats.stackexchange.com/q/108078/17230), you don't want the direction of the observed correlation to determine the direction of the test, else you ought to be performing the two-tailed test. – Scortchi - Reinstate Monica Jul 27 '16 at 16:02
  • @Scortchi: In the example above the p-value turned out to be such that the null-hypothesis was not rejected, so my answer would have been: "there is no evidence to suggest that travel distance and salary are not independent". Maybe I am too picky, but I would still like to know what the interpretation of the result would be if the same question had been asked with r=-0.9. Is it: "there is reason to believe that travel distance and salary are not independent" or "there is evidence to suggest that longer travel time is associated with lower salary"? – Ferenc Beleznay Jul 27 '16 at 17:12
  • Well, I'd still answer that the direction of the one-tailed test needs to be specified before-hand. And I've just noticed a confusion between correlation & independence implicit in the question: see [Why zero correlation does not necessarily imply independence](http://stats.stackexchange.com/q/179511/17230). Perhaps a statistician should be involved in writing these exams. Anyway, I can't see anything incorrect with either statement (perhaps substituting "correlated" or even "linearly correlated" for "not independent" & "associated"), & the latter is rather more informative. – Scortchi - Reinstate Monica Jul 28 '16 at 09:56
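
For the exam question quoted in the comments ($n=20$, $r=-0.35$), a worked calculation may help. This is only a sketch: it assumes the one-tailed alternative is $H_1:\rho<0$ and that this direction was fixed before seeing the data, which the exam question itself does not state. The critical value $t_{0.05,\,18}\approx 1.734$ is taken from standard t tables.

$$t = r\sqrt{\frac{n-2}{1-r^2}} = -0.35\sqrt{\frac{18}{1-0.1225}} \approx -0.35\times 4.529 \approx -1.585$$

Since $-1.585 > -1.734$, the statistic does not fall in the lower rejection region, so $H_0$ is not rejected at the 5% level. This agrees with the outcome reported in the comments.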

1 Answer


Yes. Instead of using a two-sided critical value from a t-distribution with $n-2$ degrees of freedom (e.g., $\pm 2.09$ for $n=22$ and $\alpha=.05$, two-sided), you would use just the upper critical value (e.g., $+1.72$ for $n=22$ and $\alpha=.05$, one-sided).
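
As a rough illustration of the one-sided procedure described above, here is a minimal Python sketch using only the standard library. The sample values `r = 0.40` and `n = 22` are hypothetical (they are not from the question), and the helper names `t_statistic`, `t_pdf` and `t_upper_tail` are made up for this example. The tail probability is approximated by Simpson's rule on the t density rather than taken from a statistics library.

```python
import math

def t_statistic(r, n):
    """Test statistic r * sqrt((n - 2) / (1 - r^2)) for H0: rho = 0."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_upper_tail(t, df, steps=2000):
    """Approximate P(T > t) using Simpson's rule on [0, |t|] (steps must be even)."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # P(0 < T < |t|)
    return 0.5 - area if t >= 0 else 0.5 + area

# Hypothetical sample values, chosen only for illustration.
r, n = 0.40, 22
t = t_statistic(r, n)
p_one_sided = t_upper_tail(t, n - 2)  # upper tail only, for H1: rho > 0

print(f"t = {t:.3f}, one-sided p = {p_one_sided:.4f}")
print("reject H0 at the 5% level:", p_one_sided < 0.05)
```

Comparing `p_one_sided` with 0.05 is equivalent to comparing `t` with the upper critical value of about 1.72 quoted above for $n=22$.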

Wolfgang
  • Thanks. Can you also please help me with what the conclusion would be if the data support rejecting the null-hypothesis? Is it: "there is reason to believe that the variables are correlated" (so rejecting $\rho=0$) or "there is reason to believe that the variables are positively correlated" (so accepting $\rho>0$)? Since $H_0$ and $H_1$ are not negations of each other, these are different conclusions. As I understand it, the conclusion should be the first one, but I would like to be sure. – Ferenc Beleznay Jul 27 '16 at 12:31
  • @FerencBeleznay: See [Why do statisticians say a non-significant result means “you can't reject the null” as opposed to accepting the null hypothesis?](http://stats.stackexchange.com/q/85903/17230). In this case though, you're accepting the alternative, which is what you've stipulated it to be: see [Is it possible to accept the alternative hypothesis?](http://stats.stackexchange.com/q/110348/17230). – Scortchi - Reinstate Monica Jul 27 '16 at 13:03
  • @FerencBeleznay The null hypothesis is $H_0: \rho \le 0$, so if the data support rejecting the null, you are rejecting the null hypothesis that the true correlation is 0 or negative (which in turn suggests that the true correlation is positive). – Wolfgang Jul 27 '16 at 18:17
  • @Wolfgang: Thanks, but the reason I ask the question is that the null-hypothesis is not $\rho\le 0$; it is the equality $\rho=0$. As far as I understand, the test statistic (the formula I mentioned in the question) only follows the t-distribution if $\rho=0$. This statistic does not follow the t-distribution when $\rho<0$, so it cannot be used with any assumption other than $\rho=0$. – Ferenc Beleznay Jul 28 '16 at 03:44
  • No, for the one-sided test, the null hypothesis is $\rho \le 0$. Obviously, if I can reject $\rho = 0$, then I can also reject $\rho = -0.5$ or $\rho = -1$, so we still use $\rho = 0$ as the null. – Wolfgang Jul 28 '16 at 06:47
  • @FerencBeleznay: Wolfgang's point is explained in more detail at [Justification of one-tailed hypothesis testing](http://stats.stackexchange.com/q/7853/17230). (Though I feel you're quite entitled to decide between $\rho=0$ & $\rho \leq 0$ as the null depending on the situation.) – Scortchi - Reinstate Monica Jul 28 '16 at 08:50
  • Sorry, but this 'obviously' is exactly why I am asking the question. It is not at all obvious to me, since the statistic used follows the t-distribution only if $\rho=0$, so I cannot see how to even start the test with $\rho=-0.5$, and if I cannot start the test, I cannot reject the hypothesis. If you could point me to a resource that shows the background of this test, including the proof that the statistic indeed follows the t-distribution, I would appreciate it. All I could find is the statement and some generalities on how to use it, but not the reason why it works. – Ferenc Beleznay Jul 28 '16 at 08:54
  • @Scortchi Thanks for that link, which explains this issue very clearly. – Wolfgang Jul 28 '16 at 08:56
  • @FerencBeleznay Does the link posted by Scortchi clarify things? – Wolfgang Jul 28 '16 at 08:58
  • @Scortchi: I read that post. I see how it applies to testing for the mean, but not how it applies to testing for the correlation. When testing for the mean, we can test for many different means, so we can set up a composite hypothesis. But as I explained in my previous post, the starting assumption when testing for correlation needs to be $\rho=0$ (to guarantee the distribution of the statistic used), so I don't see a way of adding any other hypothesis to make it composite. – Ferenc Beleznay Jul 28 '16 at 09:01
  • Ah! I think I see what you mean. The exact distribution of the sample correlation when $\rho=0$ has a considerably simpler form than the general one: https://en.wikipedia.org/wiki/Pearson_product-moment_correlation_coefficient#Using_the_exact_distribution. Nevertheless it suffices to show that any test rejecting $H_0: \rho=0$ will *also* reject $H_0: \rho=\rho_0$ for all $-1 – Scortchi - Reinstate Monica Jul 28 '16 at 09:36
  • Yes, indeed, it would suffice, but I don't know how to do this. First of all I would need a statistic to use in the $\rho=\rho_0\ne 0$ case, which itself looks tricky. Thanks for the Wikipedia article, I will check out some references. – Ferenc Beleznay Jul 28 '16 at 10:00
  • Yes, it does look rather daunting. The requirement is that reducing $\rho$ from 0 increases the cumulative distribution function for all $r$. A function for the cumulative distribution function is available in the R library `SuppDists` & so the code `library(SuppDists); r – Scortchi - Reinstate Monica Jul 28 '16 at 10:54
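
The monotonicity requirement mentioned in the last comment can also be probed informally by simulation. The sketch below is not the `SuppDists` code referred to in that comment; it is a separate, hedged illustration in Python using only the standard library. It assumes bivariate normal data, takes $n=22$ with the one-sided critical value of about 1.7247 for 20 degrees of freedom, and uses made-up helper names (`sample_r`, `rejection_rate`). If the claim holds, the upper-tail test should reject less often when the true $\rho$ is negative than when $\rho=0$.

```python
import math
import random

def sample_r(n, rho, rng):
    """Sample correlation of n draws from a bivariate normal with correlation rho."""
    xs, ys = [], []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        xs.append(z1)
        ys.append(rho * z1 + math.sqrt(1 - rho * rho) * z2)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def rejection_rate(rho, n=22, t_crit=1.7247, reps=4000, seed=1):
    """Fraction of simulated samples in which the upper-tail t test rejects."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        r = sample_r(n, rho, rng)
        t = r * math.sqrt((n - 2) / (1 - r * r))
        if t > t_crit:
            rejections += 1
    return rejections / reps

# True correlations to compare: the boundary rho = 0 and two negative values.
rates = {rho: rejection_rate(rho) for rho in (0.0, -0.2, -0.5)}
for rho, rate in rates.items():
    print(f"rho = {rho:+.1f}: rejection rate = {rate:.4f}")
```

A simulation like this does not prove the result, but it shows what the composite null $\rho\le 0$ means in practice: the 5% error rate is attained at the boundary $\rho=0$ and is smaller for negative $\rho$.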