K-Means Clustering using modified correlation (1 - pearson correlation coefficient)

Asked Jan 03 '16 at 01:27

Active Sep 27 '18 at 09:39

Viewed 1,386 times

I am trying to implement k-means clustering on a 6x6 data set that looks like this:

2 3 6 0 1 7
4 9 9 6 2 2
0 1 7 9 5 0
2 3 2 7 8 3
8 2 9 2 3 1
8 0 0 1 7 9

Using rows 2 and 4 as the centroids:

4 9 9 6 2 2
2 3 2 7 8 3

Taking the first row of the dataset and the first centroid, I can calculate the Euclidean distance like so:

$\sqrt{(4-2)^2 + (9-3)^2 + (9-6)^2 + (6-0)^2 + (2-1)^2 + (2-7)^2}$
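As a quick check of the arithmetic above, here is a minimal Python sketch of that same calculation. The function name `euclidean` and the variable names are only illustrative, not part of any library.

```python
import math

# First row of the dataset and the first centroid (row 2), as given above.
row = [2, 3, 6, 0, 1, 7]
centroid = [4, 9, 9, 6, 2, 2]

def euclidean(a, b):
    """Square root of the summed squared coordinate differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

d = euclidean(centroid, row)
print(d)  # square root of 4 + 36 + 9 + 36 + 1 + 25 = 111
```

The six squared differences sum to 111, so the distance is $\sqrt{111} \approx 10.54$.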

I now want to run the algorithm using Modified Correlation instead of Euclidean Distance, defined as

$mc = 1 - r$, where $r$ is the Pearson Correlation Coefficient.

So how does this work? I have never really worked with covariance / standard deviation in more than 2-d space. Can somebody give me a quick run-through on the first row like I did above (or point me in the right direction)? I can't seem to find documentation on how to calculate this for a 6-d data set.
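One possible reading, sketched below under a stated assumption: treat the six coordinates of the row and of the centroid as six paired observations $(x_i, y_i)$, and compute the ordinary Pearson coefficient $r$ over those pairs, so nothing beyond the usual two-variable formula is needed. The helper names `pearson` and `modified_correlation` are hypothetical, written for illustration only.

```python
import math

# First row of the dataset and the first centroid (row 2), as given above.
row = [2, 3, 6, 0, 1, 7]
centroid = [4, 9, 9, 6, 2, 2]

def pearson(a, b):
    """Pearson r over the paired coordinates of two equal-length vectors."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    # Sums of centered products; the 1/n (or 1/(n-1)) factors cancel in r.
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    ss_a = sum((x - mean_a) ** 2 for x in a)
    ss_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(ss_a * ss_b)

def modified_correlation(a, b):
    """mc = 1 - r, as defined in the question."""
    return 1.0 - pearson(a, b)

r = pearson(row, centroid)
mc = modified_correlation(row, centroid)
print(r, mc)
```

Under this reading, the first row and first centroid give $r \approx 0.082$ and therefore $mc \approx 0.918$. Note that the sketch divides by zero if either vector is constant, since its standard deviation is then zero.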

edited Sep 27 '18 at 09:39

kjetil b halvorsen


asked Jan 03 '16 at 01:27

R.S

Possible duplicate of [Why does k-means clustering algorithm use only Euclidean distance metric?](http://stats.stackexchange.com/questions/81481/why-does-k-means-clustering-algorithm-use-only-euclidean-distance-metric) – gung - Reinstate Monica Nov 19 '16 at 18:45
