Correlation between two binary variables

Question

Suppose I have a group of people who moved from country A to country B. I have one binary variable indicating whether they read the newspaper while they lived in country A, and another binary variable indicating whether they read the newspaper now, in country B. If the correlation coefficient between the two variables is 0.368, how do I interpret this number? The two-tailed significance is 0.007. Is this number relevant?

For us to know whether it is relevant we need to know what your scientific question is. — mdewey, Jul 09 '16 at 16:01

score 3 · Answer 1 · answered Jul 09 '16 at 15:35

The Pearson correlation is a poor choice of metric for comparing two binary variables. There are many ways to slice and dice this kind of data, but one of the simplest and nicest is to calculate the proportion of agreement (or, in the language of classification, accuracy). That means counting the proportion of pairs for which the two values are equal.
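As a rough illustration of the proportion of agreement described above, here is a minimal Python sketch. The function name `proportion_agreement` and the two short lists are hypothetical, made up only to show the calculation; they are not the asker's data.

```python
def proportion_agreement(x, y):
    """Return the fraction of pairs (x[i], y[i]) whose two values are equal."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    if len(x) == 0:
        raise ValueError("need at least one pair")
    # True counts as 1 when summed, so this counts the matching pairs.
    return sum(a == b for a, b in zip(x, y)) / len(x)


# Hypothetical data: 1 = read the newspaper, 0 = did not.
read_in_a = [1, 1, 0, 0, 1, 0, 1, 0]
read_in_b = [1, 0, 0, 0, 1, 1, 1, 0]

agreement = proportion_agreement(read_in_a, read_in_b)
print(f"proportion agreement = {agreement:.3f}")
```

In this made-up example, 6 of the 8 pairs match, so the proportion of agreement is 0.75.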

Kodiologist


Thanks, I tried that just now (Measure of Agreement, Kappa) and got a value of 0.419. However, when I look at the values, I see that 33 out of 46 people answered either (yes, yes) or (no, no). That seems to be a much higher proportion of pairs where the values are equal. I guess I'm missing something here. – TruckGuy Jul 09 '16 at 15:43
Cohen's $\kappa$ gives you the proportion of agreement _corrected for chance agreement_, not the raw proportion of agreement. – mdewey Jul 09 '16 at 15:56
Why is the Pearson correlation coefficient a poor choice? – Jens Wagemaker Apr 26 '21 at 12:35
@JensWagemaker It's needlessly opaque. – Kodiologist Apr 26 '21 at 14:42
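To make the distinction in the comments concrete, the sketch below computes both the raw proportion of agreement and Cohen's $\kappa$ from a 2x2 table, using the standard definition $\kappa = (p_o - p_e) / (1 - p_e)$, where $p_o$ is the observed agreement and $p_e$ is the agreement expected by chance from the marginals. The thread only reports that 33 of 46 pairs agree; the four cell counts below are an assumption, chosen so that the agreeing cells sum to 33 out of 46, and the function name is hypothetical.

```python
def agreement_and_kappa(yes_yes, yes_no, no_yes, no_no):
    """Return (observed agreement, chance agreement, Cohen's kappa) for a 2x2 table.

    Cell names are (answer in country A)_(answer in country B).
    """
    n = yes_yes + yes_no + no_yes + no_no
    if n == 0:
        raise ValueError("the table is empty")

    # Observed agreement: share of people on the (yes, yes)/(no, no) diagonal.
    observed = (yes_yes + no_no) / n

    # Marginal probabilities of answering "yes" in each country.
    p_yes_a = (yes_yes + yes_no) / n
    p_yes_b = (yes_yes + no_yes) / n

    # Agreement expected by chance if the two answers were independent.
    expected = p_yes_a * p_yes_b + (1 - p_yes_a) * (1 - p_yes_b)
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")

    kappa = (observed - expected) / (1 - expected)
    return observed, expected, kappa


# ASSUMED cell counts (not reported in the thread): 21 + 12 = 33 agreeing pairs of 46.
observed, expected, kappa = agreement_and_kappa(yes_yes=21, yes_no=4, no_yes=9, no_no=12)
print(f"observed agreement = {observed:.3f}")
print(f"chance agreement   = {expected:.3f}")
print(f"Cohen's kappa      = {kappa:.3f}")
```

With these assumed counts the observed agreement is 33/46, about 0.72, while $\kappa$ comes out noticeably lower because roughly half of that agreement would be expected by chance alone. That is the gap TruckGuy noticed.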
