Questions tagged [k-medoids]

61 questions
13
votes
3 answers

An example where the output of the k-medoid algorithm is different than the output of the k-means algorithm

I understand the difference between k-medoids and k-means. Can you give me an example with a small data set where the k-medoids output differs from the k-means output?
tubby
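A minimal sketch of one such data set, assuming k = 2, one-dimensional points, and absolute distance for k-medoids. The data set and the function names `best_kmeans` and `best_kmedoids` are hypothetical and not from the question. Both objectives are minimized by exhaustive search, so the difference does not depend on initialization:

```python
from itertools import combinations

# Hypothetical 1-D data set chosen for illustration (not from the question).
points = [0, 1, 2, 6, 7, 8, 16]

def sse(cluster):
    """Sum of squared deviations from the cluster mean (k-means cost)."""
    mean = sum(cluster) / len(cluster)
    return sum((p - mean) ** 2 for p in cluster)

def best_kmeans(points):
    """Exhaustively find the 2-cluster partition with minimal SSE."""
    best = None
    n = len(points)
    for size in range(1, n):
        for idx in combinations(range(n), size):
            a = [points[i] for i in idx]
            b = [points[i] for i in range(n) if i not in idx]
            cost = sse(a) + sse(b)
            if best is None or cost < best[0]:
                best = (cost, sorted([a, b]))
    return best

def best_kmedoids(points):
    """Exhaustively find the medoid pair with minimal total absolute distance."""
    best = None
    for m1, m2 in combinations(points, 2):
        # Assign every point to its nearest medoid (ties go to m1).
        a = [p for p in points if abs(p - m1) <= abs(p - m2)]
        b = [p for p in points if abs(p - m1) > abs(p - m2)]
        cost = sum(abs(p - m1) for p in a) + sum(abs(p - m2) for p in b)
        if best is None or cost < best[0]:
            best = (cost, sorted([a, b]))
    return best

print(best_kmeans(points))
print(best_kmedoids(points))
```

On this data the squared-error objective prefers to isolate the point 16 in its own cluster, while the absolute-distance objective keeps 16 with 6, 7 and 8 and splits off 0, 1 and 2 instead.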
9
votes
2 answers

What is the benefit of using Manhattan distance rather than Euclidean distance for K-medoids?

Please explain the reasons. I haven't found any k-medoids example whose calculation uses Euclidean distance; every example I have seen uses Manhattan distance.
MD MOHIBULLA
9
votes
1 answer

Difference between K-medoids and PAM

I understand that PAM is just one kind of K-medoids algorithm. The difference is in how the new medoid is selected in each iteration: K-medoids selects the object that is closest to the medoid as the next medoid, whereas PAM tries out all of the objects in the cluster as a…
Kobe-Wan Kenobi
5
votes
3 answers

How to perform K-medoids given a distance matrix

I've been trying for a long time to figure out how to perform the K-medoids algorithm on paper, but I can't work out how to begin and how to iterate. For example, I have the distance matrix between 6 points, the value of k, and the initial medoids C1 and C2. I'll be very…
John
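A minimal sketch of how the iteration could go when only a distance matrix is available. It uses the alternating variant of k-medoids (assign each point to its nearest medoid, then re-pick the medoid within each cluster), which is not the same as PAM's swap procedure. The 6-point matrix, the starting medoids, and the name `kmedoids_from_matrix` are hypothetical, since the question's own matrix is not shown; the sketch also assumes all points are distinct:

```python
def assign(D, medoids):
    """Label each point with the index of its nearest medoid."""
    return [min(range(len(medoids)), key=lambda c: D[i][medoids[c]])
            for i in range(len(D))]

def kmedoids_from_matrix(D, medoids, max_iter=100):
    """Alternate assignment and medoid update until the medoids stop changing."""
    medoids = list(medoids)
    for _ in range(max_iter):
        labels = assign(D, medoids)
        new_medoids = []
        for c in range(len(medoids)):
            members = [i for i, lab in enumerate(labels) if lab == c]
            # New medoid: the member with the smallest total distance
            # to the other members of its cluster.
            new_medoids.append(
                min(members, key=lambda m: sum(D[m][j] for j in members)))
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, assign(D, medoids)

# Hypothetical example: 6 points on a line, distances are absolute differences.
positions = [0, 1, 2, 10, 11, 12]
D = [[abs(a - b) for b in positions] for a in positions]

# Start from the first two points as C1 and C2 (indices into D).
medoids, labels = kmedoids_from_matrix(D, [0, 1])
print(medoids, labels)
```

Working the same steps on paper means reading the assignment off the two medoid columns of the matrix, then summing within-cluster rows to pick each new medoid.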
5
votes
2 answers

Partitioning Around Medoids (PAM) with Gower distance matrix

My data is mostly continuous but has one binary variable. I tried the pam algorithm in R with the Gower index, but the number of clusters that gives the best silhouette width is 2, which allows the binary variable to completely dominate the result.…
Jin
5
votes
1 answer

How to derive the time complexity of the k-medoids (PAM) clustering algorithm?

I have read that the time complexity of k-medoids/Partitioning Around Medoids (PAM) is O(k(n-k)^2). I am trying to understand how this algorithm translates into this time complexity. As per my assumption, we have to find the distance between each…
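One common way to arrive at this bound, assuming it refers to a single iteration of PAM's swap phase as it is usually described (that framing is an assumption, since the excerpt is truncated): each of the $k$ medoids is paired with each of the $n-k$ non-medoid objects as a candidate replacement, and evaluating one candidate swap requires re-examining the $n-k$ non-medoid objects.

$$
\underbrace{k}_{\text{medoids}} \times \underbrace{(n-k)}_{\text{candidate replacements}} \times \underbrace{(n-k)}_{\text{objects re-evaluated per swap}} = k(n-k)^2
$$

This gives $O(k(n-k)^2)$ per iteration; the total running time also depends on how many iterations are needed before no swap improves the cost.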
5
votes
1 answer

Can silhouette be calculated with distances to centroids, instead of pairwise point distances?

I am using Silhouette cluster validation for each repetition (for a specific K) of k-means, k-modes and k-medoids. All the definitions of Silhouette I see calculate the distance of each point to the other points within the same cluster, then compare it…
Fabio
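For reference, the pairwise definition that the question describes can be sketched as follows. This is the standard point-to-point version, not a centroid-based variant; the data, the labels, and the name `silhouette_values` are hypothetical, and the sketch assumes at least two clusters and no duplicate points:

```python
def silhouette_values(D, labels):
    """Per-point silhouette from a pairwise distance matrix D and cluster labels."""
    n = len(D)
    clusters = sorted(set(labels))
    values = []
    for i in range(n):
        own = [j for j in range(n) if labels[j] == labels[i] and j != i]
        if not own:
            # Common convention: a point alone in its cluster gets 0.
            values.append(0.0)
            continue
        # a: mean distance to the other points of the same cluster.
        a = sum(D[i][j] for j in own) / len(own)
        # b: smallest mean distance to the points of any other cluster.
        b = min(
            sum(D[i][j] for j in range(n) if labels[j] == c) / labels.count(c)
            for c in clusters if c != labels[i]
        )
        values.append((b - a) / max(a, b))
    return values

# Hypothetical example: 6 points on a line in two obvious clusters.
positions = [0, 1, 2, 10, 11, 12]
D = [[abs(p - q) for q in positions] for p in positions]
s = silhouette_values(D, [0, 0, 0, 1, 1, 1])
print(sum(s) / len(s))  # average silhouette width
```

Replacing the two mean pairwise distances with distances to the own and nearest other cluster center would give a cheaper approximation, but its values are not guaranteed to match the ones computed here.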
4
votes
0 answers

Using Davies-Bouldin index in clustering

I am clustering data using k-medoids. I used the Davies–Bouldin index for $2$ to $n-1$ clusters. Here $n = 100$ (using a smaller test case). I find the minimal value of the index for 98 clusters. But the overall accuracy rate for 98 clusters is very small…
Diptopol Dam
4
votes
1 answer

Why use K-medoids for sequence analysis?

In the package WeightedCluster there seem to be facilities for using K-medoids clustering (i.e. wcKMedoids()), but not the more common K-means. Some traditional recommendations of clustering specify that first one should determine the number of…
histelheim
4
votes
0 answers

Why does k-means clustering use the sum of squared errors (and why does k-medoids not)?

K-means clustering uses the sum of squared errors (SSE) $E = \sum\limits_{i=1}^k \sum\limits_{p \in C_i} (p-m_i)^2$ (with $k$ clusters, $C_i$ the set of objects in cluster $i$, and $m_i$ the center point of cluster $i$) after each iteration to check if SSE is…
dominic
4
votes
1 answer

Cluster analysis without knowing the structure of the data set

I have been working on a cluster analysis task for about half a year now, but since the fields of pattern recognition and cluster analysis are quite complex, I would still call myself a beginner in this subject. I’m trying to cluster some…
Leonard
4
votes
3 answers

Clustering based on large Jensen-Shannon Divergence distance matrix

I have a dataset with a large number of features and about 15,000 observations. To cluster the observations, I’m using a probability distribution distance metric related to Jensen-Shannon divergence (JSD), calculated as described in…
Andres Kull
3
votes
1 answer

Log-likelihood distance measure validity for clustering

I have calculated log-likelihood distances between 50 sequences according to Formula (1): $$ D(X_i,X_j)= \frac{1}{2}\left(\log p(X_i|Mod_j)+\log p(X_j|Mod_i)\right), $$ where $ p(X_i|Mod_j) $ is the likelihood of sequence $X_i$ being produced by model $Mod_j$,…
zima
3
votes
1 answer

Partitioning Around Medoids

I have a question regarding the Partitioning Around Medoids (PAM) clustering algorithm, because everywhere I look, it is described differently. In every step of the algorithm, do I swap only one medoid or more? I mean, does the swapping step look…
user1315305
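A minimal sketch of the swap phase under one common reading of PAM: in each step every (medoid, non-medoid) pair is evaluated and only the single best improving swap is applied. The BUILD initialization is omitted, the cost of each candidate is recomputed from scratch rather than updated incrementally, and the data and the names `total_cost` and `pam_swap` are hypothetical:

```python
def total_cost(D, medoids):
    """Sum over all points of the distance to the nearest medoid."""
    return sum(min(D[i][m] for m in medoids) for i in range(len(D)))

def pam_swap(D, medoids, max_iter=100):
    """Repeatedly apply the single best (medoid, non-medoid) swap."""
    medoids = list(medoids)
    cost = total_cost(D, medoids)
    for _ in range(max_iter):
        best_cost, best_medoids = cost, None
        for pos in range(len(medoids)):
            for h in range(len(D)):
                if h in medoids:
                    continue
                # Candidate: replace exactly one medoid with object h.
                candidate = medoids[:pos] + [h] + medoids[pos + 1:]
                c = total_cost(D, candidate)
                if c < best_cost:
                    best_cost, best_medoids = c, candidate
        if best_medoids is None:
            break  # no swap lowers the cost, so stop
        cost, medoids = best_cost, best_medoids
    return sorted(medoids), cost

# Hypothetical example: 6 points on a line, distances are absolute differences.
positions = [0, 1, 2, 10, 11, 12]
D = [[abs(p - q) for q in positions] for p in positions]
final_medoids, final_cost = pam_swap(D, [0, 1])
print(final_medoids, final_cost)
```

Under this reading exactly one medoid changes per step; descriptions that swap several medoids at once are describing a different variant.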
3
votes
0 answers

When to use K-Medoids instead of K-means

When is it better to use K-Medoids rather than K-Means? Can anybody give some examples of datasets where this is the case?