I am looking for a stability measure/index for Fuzzy C-Means clustering in R. Can anyone point me to an R package or some R code? I am not looking for internal indices, only stability measures.

Kitty123

1 Answer

I am not very familiar with fuzzy clustering, but going through the literature, Dunn’s partition coefficient seems to be used often. The cluster package reports it for fanny(), a similar fuzzy clustering algorithm, and its documentation says:

coeff: Dunn’s partition coefficient F(k) of the clustering, where k is the number of clusters. F(k) is the sum of all squared membership coefficients, divided by the number of observations. Its value is between 1/k and 1. The normalized form of the coefficient is also given. It is defined as (F(k) − 1/k)/(1 − 1/k), and ranges between 0 and 1. A low value of Dunn’s coefficient indicates a very fuzzy clustering, whereas a value close to 1 indicates a near-crisp clustering.
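
To make the definition concrete, here is a minimal sketch in base R that evaluates the coefficient at its two extremes. The helper name partition_coef and the toy membership matrices are made up for illustration; they are not part of cluster or e1071, and the sketch assumes an n x k membership matrix whose rows sum to 1.

```r
# Hypothetical helper (not from cluster or e1071): partition coefficient
# computed from an n x k membership matrix whose rows sum to 1.
partition_coef <- function(u) {
  n <- nrow(u)
  k <- ncol(u)
  f <- sum(u^2) / n  # F(k): sum of squared memberships over n observations
  c(raw = f, normalized = (f - 1 / k) / (1 - 1 / k))
}

k <- 3
# Crisp case: every observation belongs fully to exactly one cluster.
crisp <- diag(k)[c(1, 2, 3, 1), ]
# Maximally fuzzy case: every observation is split evenly over the k clusters.
fuzzy <- matrix(1 / k, nrow = 4, ncol = k)

pc_crisp <- partition_coef(crisp)  # raw and normalized both equal 1
pc_fuzzy <- partition_coef(fuzzy)  # raw equals 1/k, normalized is 0 up to rounding
print(rbind(crisp = pc_crisp, fuzzy = pc_fuzzy))
```

The crisp matrix hits the upper bound of 1 and the evenly split matrix hits the lower bound of 1/k, which is why the normalized form maps onto the range 0 to 1.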

We can try that using iris with cmeans() from e1071 and also compare it with fanny():

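Fn = function(c1){
  # c1: a fit from fanny() or cmeans(); both store an n x k membership matrix
  n = nrow(c1$membership)
  k = ncol(c1$membership)       # number of clusters
  nu = sum(c1$membership^2)/n   # Dunn's partition coefficient F(k)
  c(nu,(k*nu - 1)/(k-1))        # raw and normalized coefficient
}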

library(e1071)
library(cluster)
library(mclust)

data <-scale(iris[,-5])

dunn_fanny = sapply(2:6,function(i)Fn(fanny(data,i,maxit=1000)))
dunn_cmeans = sapply(2:6,function(i)Fn(cmeans(data,i)))

plot(2:6,dunn_fanny[2,],type="l",ylim=c(0,1),col="blue",
     ylab="Normalized partition coefficient",xlab="no of clusters")
# ylim is not a valid argument for lines(); the axis limits come from plot()
lines(2:6,dunn_cmeans[2,],col="brown")
legend("topright",legend=c("fanny","cmeans"),fill=c("blue","brown"))

[Plot: normalized partition coefficient against number of clusters (2 to 6) for fanny and cmeans]

Whether this works really depends on your data. My guess is that, as with all clustering methods, there is no reliably good way to determine the number of clusters. For comparing methods, though, it might be OK.

StupidWolf