Why are K-means and GMM (Gaussian Mixture Models) not suitable for discovering clusters with non-convexs shapes?

Question

I have seen that mainly here and from a lot of resources that K-means and Hello all!

Gaussian mixtures are not suitable for detecting clusters with non-convex shapes. I know that because both methods have an assumption about clusters being spherical(convex type).

I am more curious about underlying reason about this inability? What does it mathematically mean that assuming clusters being convex? Exactly what step in those methods/algorithms make themselves restrict to discover non-convex shapes while others (for example DBSCAN) fulfills this task?

TL,DR: What mechanism of K-means or GMM algorithms restrict themselves away from discovering non-convex shapes ?

Thanks in advance!

Mickybo Yakari · Answer 1 · 2019-12-26T16:22:09.050

Let $\mathbb{R}^p$ be our ambient space. It is assumed that we use the Euclidean distance.

A convex set $S \subseteq \mathbb{R}^p$ is one that contains every line segment that joins two of its element.

Let $p_1,...,p_k$ be a set of points in $\mathbb{R}^p$ and let $r \neq s \in \{1,...,k\}$.

What is the equation of the set of points in $\mathbb{R}^p$ that are equidistant from $a=p_r$ and $b=p_s$?

These are the points $(x_1,...,x_p)$ such that $$ \sum_{i=1}^{p} (x_i-a_i)^2 = \sum_{i=1}^{p} (x_i-b_i)^2. $$

Now, if you expand both sides and substract $\sum_{i=1}^{p} {x_i}^2$ from them, you are left with the equation of a hyperplane: $$2 \sum_{i=1}^{p} (a_i-b_i)x_i + \sum_{i=1}^{p} (a_i^2-b_i^2) = 0.$$

The points that are at least as close to $a$ than they are to $b$ are exactly those that satisfy $$ 2 \sum_{i=1}^{p} (a_i-b_i)x_i + \sum_{i=1}^{p} (a_i^2-b_i^2) \geq 0.$$

A set obtained by replacing the equality of the equation of a hyperplane by an inequality is called a half-space.
Half-spaces are convex sets.
The intersection of two convex sets is convex.
We apply the same reasoning to every pair of elements of $\{p_1,...,p_k\}$ and notice that every iteration of the algorithm necessarily leaves us with a subdivision of the space into convex subsets.

Why are K-means and GMM (Gaussian Mixture Models) not suitable for discovering clusters with non-convexs shapes?

1 Answers1