When would I want to assign cluster probabilities to patterns instead of hard assignments to clusters? Can someone elaborate?

Nick Cox
Dahlai

1 Answer

The prototypical cases are situations in which there is good reason to believe that clusters exist, but there is no clear separation between them. In such cases, your cluster assignments are genuinely uncertain, so it is best to use an approach that reflects that uncertainty. Fitting a finite Gaussian mixture model (GMM) is one way to do so; note that the EM algorithm is just the way you estimate the GMM, not the clustering model itself. (For what it's worth, there are other options, such as fuzzy k-means.)
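To make the contrast with hard assignments concrete, here is a minimal sketch of the membership probabilities a two-component, one-dimensional GMM would report. The weights, means, and standard deviations are hand-picked, hypothetical values chosen so that the clusters overlap; in practice they would be estimated from data (for example, by EM). The helper names `normal_pdf` and `responsibilities` are my own, not taken from any library.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a univariate normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def responsibilities(x, weights, means, sds):
    """Posterior probability that x came from each mixture component."""
    # Weighted density of x under each component.
    joint = [w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, sds)]
    total = sum(joint)
    # Normalize so the membership probabilities sum to 1.
    return [j / total for j in joint]


# Hypothetical parameters for two overlapping clusters (assumed, not estimated).
weights = [0.5, 0.5]
means = [0.0, 3.0]
sds = [1.0, 1.0]

for x in [-1.0, 1.2, 1.5, 1.8, 4.0]:
    probs = responsibilities(x, weights, means, sds)
    hard = probs.index(max(probs))  # the label a hard assignment would report
    print(f"x={x:5.1f}  P(cluster 0)={probs[0]:.3f}  "
          f"P(cluster 1)={probs[1]:.3f}  hard label={hard}")
```

Points far from the overlap get a probability close to 1 for one cluster, so the hard label loses little information there. Points between the two means get probabilities near one half, and a hard label would hide exactly that uncertainty.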

For some concrete examples of situations like this, it may help to read some of my answers that have discussed / demonstrated this:

gung - Reinstate Monica
  • I asked a similar question [here](https://stats.stackexchange.com/questions/372477/comparing-k-means-and-expectation-maximization-on-the-dataset-generated-does-k), but it is about cluster quality. Could you help me with this? – Suhail Gupta Oct 18 '18 at 05:43