I'm trying to calculate the correlation coefficient using a formula from Statistics (4th edition) by Freedman:
r = average of (x in standard units) * (y in standard units)
If I try this out in R:
x = 1:7
y = c(6,7,5,4,3,1,2)
x.z = scale(x)
y.z = scale(y)
prod = x.z * y.z
mean(prod)
[1] -0.7959184
However, if I use the built-in cor function, I get a different answer:
cor(x, y)
[1] -0.9285714
Looking through the worked examples in the book, the standard units for x and y seem to be rounded to the nearest 0.5, so I rounded my values the same way and got the expected answer:
x.z.round = round(x.z/0.5)*0.5
y.z.round = round(y.z/0.5)*0.5
prod.round = x.z.round * y.z.round
mean(prod.round)
[1] -0.9285714
Why do the x and y scaled values seemingly need to be rounded to the nearest 0.5?
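One thing that may be worth checking is which standard deviation is being used. scale() divides by the sample SD (n - 1 in the denominator, the same as sd()), whereas the book's "standard units" appear to be based on the SD with n in the denominator. The sketch below tests that assumption; std.units is a hypothetical helper name, not something from the book or from base R.

```r
# Hypothetical helper: convert to standard units using the population SD
# (divide by n). That the book defines SD this way is an assumption here.
std.units = function(v) {
  pop.sd = sqrt(mean((v - mean(v))^2))
  (v - mean(v)) / pop.sd
}

x = 1:7
y = c(6,7,5,4,3,1,2)
n = length(x)

# Average of the products of standard units, using the population SD
r.pop = mean(std.units(x) * std.units(y))

# Same calculation with scale(), which uses sd() (n - 1 in the denominator)
r.scale = mean(scale(x) * scale(y))

# If the assumption holds, r.pop should agree with cor(x, y), and
# r.scale should be smaller in magnitude by a factor of (n - 1) / n.
all.equal(r.pop, cor(x, y))
all.equal(r.scale, cor(x, y) * (n - 1) / n)
```

If both checks come back TRUE, the rounding is probably a coincidence of this data set: the population SD of both x and y is exactly 2 here, so the unrounded standard units are already multiples of 0.5, and rounding the scale() output to the nearest 0.5 happens to land on the same values.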