I'm using external validation to measure the success of a clustering algorithm. I don't consider my reference categories to be definitive, so I'm looking for a measure that is forgiving to the following extent:
- If two clusters are merged into one, this shouldn't be unduly penalised, as it is still a good match (though it should still be penalised to some extent, to avoid pushing the algorithm towards generating huge clusters)
- If one cluster is split into two, this shouldn't be unduly penalised either
- Suppose there are two reference clusters, A and B. Suppose half of A is placed in a cluster A2, half of B in a cluster B2, and the remaining halves of A and B are combined into a cluster C. This kind of alternative categorisation should be penalised more than the first two cases, but not unduly, as it could well represent another valid classification (see the sketch after this list)
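To make the scenarios concrete, here is a small sketch (Python, assuming scikit-learn is available) that encodes them as flat label vectors and scores them against the reference labels with two standard external measures, `adjusted_rand_score` and `v_measure_score`. These are only examples for comparison, not necessarily the measure I'm looking for:

```python
# Encode the three scenarios as flat label vectors so that candidate
# external measures can be compared on them.
from sklearn.metrics import adjusted_rand_score, v_measure_score

# Reference categorisation: two categories A and B, four items each.
truth = ["A"] * 4 + ["B"] * 4

# Scenario 1: A and B merged into a single cluster.
merged = ["AB"] * 8

# Scenario 2: A split into two clusters, B left intact.
split = ["A1", "A1", "A2", "A2"] + ["B"] * 4

# Scenario 3: half of A in A2, half of B in B2, the remaining halves in C.
crosscut = ["A2", "A2", "C", "C", "B2", "B2", "C", "C"]

for name, pred in [("merge", merged), ("split", split), ("cross-cut", crosscut)]:
    print(name,
          "ARI:", round(adjusted_rand_score(truth, pred), 3),
          "V-measure:", round(v_measure_score(truth, pred), 3))
```

Ideally the measure would penalise the first two scenarios only lightly and the third somewhat more, without rejecting it outright.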
What is a good measure for this kind of cluster validation?
This is related to my previous question.