I keep seeing that we can't use Pearson correlation for binary variables, but I don't understand why. If, instead of a binary response, I have multiple (>2) categories, then the problem is obvious: we are unsure about the ranking of the categories. However, let's say we have a binary gender response (0 = female, 1 = male) and some continuous variable (e.g. height):
B = matrix(
c(0,0,0,0,0,0,1,1,1,1,1,1,150,165,160,157,170,155,168,169,172,180,190,176),
nrow=12,ncol=2)
[,1] [,2]
[1,] 0 150
[2,] 0 165
[3,] 0 160
[4,] 0 157
[5,] 0 170
[6,] 0 155
[7,] 1 168
[8,] 1 169
[9,] 1 172
[10,] 1 180
[11,] 1 190
[12,] 1 176
cor(B[,1], B[,2])
0.7564467
Pearson gives a strong positive correlation of 0.76, which seems only logical, as men are taller in this sample. So why can't we use Pearson for variable preselection?
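To make my confusion concrete, here is a minimal sketch of a check I would expect to hold. As far as I understand, Pearson's r with a 0/1 variable is the point-biserial correlation, and testing it is equivalent to a pooled-variance two-sample t-test. The names `group`, `height`, `r_pb` and `t_from_r` are my own, and the hand-written formula assumes the population standard deviation (dividing by n, not n - 1).

```r
# Same data as in the matrix B: 0/1 group code and height.
group  <- c(0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1)
height <- c(150, 165, 160, 157, 170, 155, 168, 169, 172, 180, 190, 176)

# Point-biserial correlation written out by hand.
m1  <- mean(height[group == 1])                 # mean height where group == 1
m0  <- mean(height[group == 0])                 # mean height where group == 0
p   <- mean(group == 1)                         # proportion of 1s
s_n <- sqrt(mean((height - mean(height))^2))    # population SD (divides by n)
r_pb <- (m1 - m0) / s_n * sqrt(p * (1 - p))

# Plain Pearson correlation on the 0/1 codes.
r_pearson <- cor(group, height)
print(all.equal(r_pb, r_pearson))

# The t statistic implied by r should match a pooled-variance t-test.
n <- length(height)
t_from_r <- r_pearson * sqrt((n - 2) / (1 - r_pearson^2))
t_pooled <- t.test(height[group == 1], height[group == 0], var.equal = TRUE)
print(all.equal(unname(t_pooled$statistic), t_from_r))
```

If both comparisons come back `TRUE`, then for a two-level variable Pearson seems to carry the same information as comparing the two group means, which is why the warning puzzles me.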