Is there a clustering algorithm that maximizes average silhouette? If not, why not?

Question

The average silhouette index of a clustering seems to be used mainly to choose the hyperparameter $k$ (i.e. the number of clusters) in clustering algorithms like $k$-means. Is there an algorithm that instead takes $k$ as fixed and maximizes the average silhouette index?

If there is not, is there an explanation for why this would not be a good algorithm? Is it too hard, or does it produce bad clusters?

I have found this paper: Clustering Categorical Data Using Silhouette Coefficient as a Relocating Measure

But I am interested in optimal algorithms or approximations, not heuristics.

score 1 · Answer 1 · answered Oct 04 '18 at 11:29

It's not very scalable: computing the Silhouette of a single clustering takes O(n²) time, so re-evaluating it inside the loop of a clustering algorithm will quickly yield an O(n³) or worse algorithm.
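To make the quadratic cost concrete, here is a minimal sketch of the average silhouette in plain Python. For each point $i$, $a(i)$ is the mean distance to the other members of its own cluster, $b(i)$ is the smallest mean distance to any other cluster, and $s(i) = (b(i) - a(i)) / \max(a(i), b(i))$. The function name `average_silhouette`, the use of Euclidean distance, and the convention $s(i) = 0$ for singleton clusters are assumptions of this sketch, not any particular library's API.

```python
from math import dist


def average_silhouette(points, labels):
    """Average silhouette width of a labelling (needs at least 2 clusters).

    The nested loops over i and j perform O(n^2) distance evaluations.
    """
    n = len(points)
    clusters = sorted(set(labels))
    sizes = {c: labels.count(c) for c in clusters}
    total = 0.0
    for i in range(n):
        # Sum of distances from point i to the members of every cluster.
        sums = {c: 0.0 for c in clusters}
        for j in range(n):
            if j != i:
                sums[labels[j]] += dist(points[i], points[j])
        own = labels[i]
        if sizes[own] == 1:
            continue  # assumed convention: s(i) = 0 for a singleton cluster
        a = sums[own] / (sizes[own] - 1)
        b = min(sums[c] / sizes[c] for c in clusters if c != own)
        if max(a, b) > 0:  # guard against duplicate points
            total += (b - a) / max(a, b)
    return total / n
```

Any algorithm that calls such a function once per candidate move pays this O(n²) cost each time, which is where the extra factor comes from.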

You apparently can modify PAM to optimize Silhouette:

van der Laan, M. J., Pollard, K. S., & Bryan, J. (2003). A new partitioning around medoids algorithm. Journal of Statistical Computation and Simulation, 73(8), 575–584

But the authors also suggest using an approximation that is faster to compute, and they only used tiny data sets. I am not sure if the claim that it is "as fast as PAM" actually holds, because PAM includes a clever trick to compute only the change in TD (the total deviation) instead of recomputing TD repeatedly; recomputing it would probably add a factor of O(n). It is not obvious that such a delta-cost approach is possible even with the simplified Silhouette...

Nevertheless, I'd be interested in having an implementation of this contributed to ELKI for experimentation... even if it is just to know that the approach scales too badly to be useful for today's data (and maybe I'm wrong with the above impression, and it is "just" as bad as PAM).

ttnphns · Answer 2 · 2018-10-04T12:14:41.993

I did not read the paper that you link to, but I have long known of (and been using) the option to relocate objects based on the Silhouette statistic in order to improve clustering results. There is a collection "Clustering criterions" on my web page with a document about internal clustering criteria, where I wrote about the Silhouette index:

Using silhouette value for improvement of a clustering. That for each object the closest to it other cluster is known opens an opportunity for improving a classification. Object(s) having Silhouette statistic too low comparatively with other objects of its cluster can be reassigned by the user to the cluster closest to that object (its code is output by the macro), then rerun the macro to see if that transfer enhanced the object’s Silhouette statistic and the average Silhouette statistic of the entire cluster solution. One might do that transfer several times with the same or different objects (there comes out a kind of a relocating cluster algorithm aiming to improve classification without altering the number of clusters). Average Silhouette statistic ceases to improve soon, but some while its variance can still be decreasing, i.e. thickness of some cluster silhouettes be increasing.

Often, but not always, such post-clustering reassignment does improve the overall final Silhouette a little bit. One problem is that it is difficult to decide unquestionably which objects, and how many objects at once, should be moved from their clusters to their neighbouring clusters. It is a combinatorial optimization task, an optimal solution of which would be expensive in time. (So in this respect I agree with @Erich Schubert's answer.) Designing a clustering algorithm, even a greedy one, that works by such Silhouette-maximizing relocation from the very start would be even more burdensome.
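As an illustration only, one greedy variant of such a relocation could look like the sketch below: it moves one object at a time to whichever cluster raises the average Silhouette and stops when no single move helps. The names `avg_silhouette` and `relocate`, the one-object-at-a-time rule, Euclidean distance, and labels coded as integers `0..k-1` are all assumptions of this sketch; it is not the macro mentioned above nor the algorithm from the linked paper.

```python
from math import dist


def avg_silhouette(D, labels, k):
    """Average silhouette from a distance matrix D; every cluster non-empty."""
    n = len(labels)
    sizes = [labels.count(c) for c in range(k)]
    total = 0.0
    for i in range(n):
        sums = [0.0] * k
        for j in range(n):
            sums[labels[j]] += D[i][j]  # D[i][i] == 0, so j == i adds nothing
        own = labels[i]
        if sizes[own] == 1:
            continue  # assumed convention: s(i) = 0 for a singleton cluster
        a = sums[own] / (sizes[own] - 1)
        b = min(sums[c] / sizes[c] for c in range(k) if c != own)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / n


def relocate(points, labels, k, max_passes=10):
    """Greedily move single objects while the average silhouette improves."""
    D = [[dist(p, q) for q in points] for p in points]
    labels = list(labels)
    best = avg_silhouette(D, labels, k)
    for _ in range(max_passes):
        improved = False
        for i in range(len(points)):
            current = labels[i]
            if labels.count(current) == 1:
                continue  # never empty a cluster, so k stays fixed
            for c in range(k):
                if c == current:
                    continue
                labels[i] = c  # tentative move
                score = avg_silhouette(D, labels, k)
                if score > best + 1e-12:
                    best, current, improved = score, c, True
            labels[i] = current  # keep the best cluster found for object i
        if not improved:
            break
    return labels, best
```

Each pass recomputes the full Silhouette about n(k-1) times, so the sketch costs on the order of k·n³ per pass, and it only reaches a local optimum that depends on the starting partition.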

For the classical (original) Silhouette index, which is computed from averages of distances, the classic between-group average (UPGMA) hierarchical clustering is the most natural approximation or kin; roughly speaking, this method tries to optimize cheaply something close to the Silhouette statistic.
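For reference, a naive sketch of average-linkage (UPGMA) agglomeration is given below. It repeatedly merges the two clusters with the smallest mean pairwise distance, which is the same kind of quantity as the cluster-average distances in the Silhouette. The function name `upgma`, Euclidean distance, and the brute-force recomputation of linkages are assumptions made for brevity; real implementations update the linkages incrementally.

```python
from math import dist


def upgma(points, k):
    """Average-linkage agglomerative clustering, cut at k clusters."""
    clusters = [[i] for i in range(len(points))]  # start from singletons

    def avg_link(A, B):
        # Mean distance over all pairs with one member in A and one in B.
        total = sum(dist(points[i], points[j]) for i in A for j in B)
        return total / (len(A) * len(B))

    while len(clusters) > k:
        pairs = ((p, q) for p in range(len(clusters))
                 for q in range(p + 1, len(clusters)))
        p, q = min(pairs,
                   key=lambda pq: avg_link(clusters[pq[0]], clusters[pq[1]]))
        clusters[p] += clusters.pop(q)  # q > p, so index p stays valid

    labels = [0] * len(points)
    for c, members in enumerate(clusters):
        for i in members:
            labels[i] = c
    return labels
```

On two well-separated groups, for example `upgma([(0, 0), (0, 1), (0, 2), (10, 0), (10, 1), (10, 2)], 2)`, the sketch recovers the two groups without ever evaluating the Silhouette itself.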

In a footnote to this answer and under the link there I mention that all clustering algorithms each aim to optimize some approximation to one of the many internal clustering criteria (Silhouette being just one of them). "Approximation", because exact optimization of a given criterion is often a very expensive task, or one that nobody currently knows how to solve.
