Let's say we have two vectors, X = c(1, 1, 1) and Y = c(2, 2, 2), and I want to calculate the correlation between them. Pearson correlation gets us nowhere: the standard deviation of each vector is 0, so we would divide by 0, which means the Pearson correlation is undefined. However, let's use the geometric approach, which says that the correlation coefficient can be calculated as:

$$r = \frac{\sum_{i} x_i y_i}{\sqrt{\sum_{i} x_i^2}\cdot\sqrt{\sum_{i} y_i^2}}$$

Substituting our two vectors, we get:

$$r = \frac{1\cdot 2 + 1 \cdot 2 + 1 \cdot 2}{\sqrt{1^2+1^2+1^2}\cdot\sqrt{2^2+2^2+2^2}} = \frac{6}{\sqrt{3}\cdot \sqrt{12}} = \frac{6}{\sqrt{36}}=1$$

Why do these approaches give different results if they are equivalent? Which result is correct?

John
  • [Correlation with a constant](https://stats.stackexchange.com/questions/267152/correlation-with-a-constant) – user2974951 Jan 22 '21 at 11:55
  • [Pearson correlation of data sets with possibly zero standard deviation?](https://stats.stackexchange.com/questions/9068/pearson-correlation-of-data-sets-with-possibly-zero-standard-deviation?rq=1) – user2974951 Jan 22 '21 at 11:55
  • [What is the correlation if the standard deviation of one variable is 0?](https://stats.stackexchange.com/questions/18333/what-is-the-correlation-if-the-standard-deviation-of-one-variable-is-0?rq=1) – user2974951 Jan 22 '21 at 11:56

1 Answer

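Thanks to @user2974951, who points to very good threads on the topic. There is also this one: "Is there any relationship among cosine similarity, pearson correlation, and z-score?", which shows that Pearson correlation is the cosine similarity between centered vectors, which $X$ and $Y$ in your example are not. The two approaches are equivalent only after centering. If you center them (i.e., subtract their means), both become the zero vector $(0, 0, 0)$, so $r=\text{cos}(\theta)=\frac{0}{0}$ is undefined, which agrees with the Pearson result. The value $1$ you computed is the cosine similarity of the uncentered vectors, not their correlation.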
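To make the difference concrete, here is a minimal sketch in plain Python (standard library only). The helper names `cosine`, `center` and `pearson` are my own, chosen for illustration, and returning `None` for a zero-length vector is just one possible convention for "undefined".

```python
import math

def cosine(u, v):
    """Cosine of the angle between u and v, or None if either norm is zero."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return None  # 0/0: the angle is undefined
    return dot / (norm_u * norm_v)

def center(u):
    """Subtract the mean from every component."""
    mean = sum(u) / len(u)
    return [a - mean for a in u]

def pearson(u, v):
    """Pearson correlation computed as the cosine of the centered vectors."""
    return cosine(center(u), center(v))

x = [1, 1, 1]
y = [2, 2, 2]

raw = cosine(x, y)        # uncentered: approximately 1
centered = pearson(x, y)  # centered: both vectors are all zeros, so None

print(raw, centered)
```

With these two vectors, the uncentered cosine comes out at approximately 1 (up to floating-point rounding), while the centered version hits the zero-norm branch and reports the correlation as undefined.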

POC