3

How can I measure the linear correlation between non-normally distributed variables? As I understand it, the Pearson coefficient is not valid for non-normally distributed data, and Spearman's rho does not capture linear correlation.

Thank you

chl
Julian
    Linear correlation is defined without regard to the underlying distribution. I do not know what you mean by "Pearson is not valid for non-normally distributed data." Perhaps you are thinking of hypothesis testing? – charles.y.zheng Mar 09 '11 at 09:45
    This question is an extremely close relative of http://stats.stackexchange.com/questions/3730/pearsons-or-spearmans-correlation-with-non-normal-data – whuber Mar 09 '11 at 17:09

3 Answers

3

Why do you require normality for computing a correlation? How about a simple scatterplot? As long as the data are continuous, ordinary (Pearson) correlation should be fine. All that it is measuring is the strength of the linear relationship between two variables (if indeed there is such a relationship).
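As a minimal sketch of this point, the snippet below computes the sample Pearson coefficient for two clearly non-normal (exponentially distributed, hence skewed) variables. The distributions, the slope of 2, the seed and the helper name `pearson` are illustrative assumptions, not something from the question. For this setup the population correlation is $2/\sqrt{5} \approx 0.894$, and the sample value should land near it, with no normality involved.

```python
import math
import random


def pearson(xs, ys):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(0)  # fixed seed, chosen arbitrarily for reproducibility
n = 5000

# x is exponential (skewed, not normal); the noise is a centred exponential.
x = [rng.expovariate(1.0) for _ in range(n)]
noise = [rng.expovariate(1.0) - 1.0 for _ in range(n)]

# y depends linearly on x, plus skewed noise.
y = [2.0 * xi + ei for xi, ei in zip(x, noise)]

r = pearson(x, y)
print(f"sample Pearson r = {r:.3f}")
```

What normality does affect is inference on the coefficient (tests and confidence intervals), not the coefficient itself.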

Galit Shmueli
3

In addition to Anscombe's quartet, mentioned by Peter Flom, here is a very nice paper from the risk-management context that illustrates the problems of using linear correlation with non-normally distributed variables. In a nutshell, much of our intuition about how correlation behaves (all values of $\rho \in [-1, 1]$ are possible; an exact monotonic relationship implies $|\rho| = 1$; $\rho = 0$ implies independence; etc.) doesn't necessarily apply in the case of non-normality.
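To make two of these points concrete, here is a small sketch. The lognormal setup is my own choice of illustration and not necessarily the example used in the paper. Take $X = e^{Z}$ and $Y = e^{\sigma Z}$ with $Z$ standard normal: $Y$ is an exact increasing function of $X$, yet the Pearson correlation is strictly below 1 whenever $\sigma \neq 1$, and this is also the largest correlation attainable for those two marginals. The second part shows a variable that is a deterministic function of another while having zero correlation with it.

```python
import math


def lognormal_corr_bounds(sigma):
    """Attainable Pearson correlations for X ~ LN(0, 1) and Y ~ LN(0, sigma^2).

    The maximum is reached when Y = exp(sigma * Z) (comonotonic with X = exp(Z)),
    the minimum when Y = exp(-sigma * Z) (countermonotonic).
    """
    denom = math.sqrt((math.e - 1.0) * (math.exp(sigma ** 2) - 1.0))
    rho_min = (math.exp(-sigma) - 1.0) / denom
    rho_max = (math.exp(sigma) - 1.0) / denom
    return rho_min, rho_max


def pearson(xs, ys):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# 1) Exact monotonic dependence, but the correlation cannot reach +/-1.
rho_min, rho_max = lognormal_corr_bounds(2.0)
print(f"sigma = 2: rho in [{rho_min:.3f}, {rho_max:.3f}]")

# 2) Zero correlation without independence: y is a function of x.
xs = [-2, -1, 0, 1, 2]
ys = [v * v for v in xs]
r_dependent = pearson(xs, ys)
print(f"corr(x, x^2) on a symmetric grid = {r_dependent}")
```

For $\sigma = 2$ the attainable range is roughly $[-0.09, 0.67]$, so a correlation of 0.6 between such variables would indicate near-perfect dependence, not moderate dependence.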

Hong Ooi
    @Julian, while you're reading, be sure to also check out [Mikosch's critique](http://www.math.ku.dk/~mikosch/Preprint/Copula/s.pdf) of copulas. The article was published with several discussion responses, many of which are quite interesting in their own right. – cardinal Mar 10 '11 at 14:51
2

What @galit said is absolutely right. You can find the linear correlation between any two continuously distributed variables.

But perhaps you are thinking of the meaning of such a correlation? Indeed, Anscombe's quartet shows that while the correlation is defined for any pair of continuous variables, and its mathematical and statistical meaning is the same, its substantive meaningfulness may vary.
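For anyone who wants to check this numerically, the sketch below computes the Pearson coefficient for the four Anscombe (1973) data sets. The helper name `pearson` and the dictionary layout are my own. All four coefficients come out at about 0.816, although a scatterplot shows a roughly linear cloud, a smooth curve, a line with one outlier, and a vertical stack with a single leverage point.

```python
import math


def pearson(xs, ys):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Anscombe's quartet: data sets I-III share the same x values.
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
x4 = [8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8]

quartet = {
    "I": (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": (x4, [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}

correlations = {name: pearson(xs, ys) for name, (xs, ys) in quartet.items()}
for name, r in correlations.items():
    print(f"data set {name}: r = {r:.3f}")
```

The number is computed identically in each case; only the plot tells you whether "strength of linear relationship" is a sensible summary.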

Peter Flom