For a multilabel dataset, I would like to find the number of clusters it contains. The example below illustrates the problem:
Label_A: feature values
Label_B: feature values
Label_A, Label_C: feature values
Label_C: feature values ... etc
Say we have $n$ data records. The label field of a record may hold a single label or multiple labels (as in record 3).
I would like to determine the number of clusters in the data. Assuming that the number of clusters equals the number of labels results in poor accuracy, because a single label may be spread over multiple clusters. In that case, if we can find more clusters than labels and assign two or more clusters to the same label, we can increase the accuracy.
Hence, how do I find the number of clusters present in multilabel data?
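To make the idea concrete, here is a minimal sketch of what I have in mind. It is not a method I have validated. It assumes numeric features, Euclidean distance, plain k-means with a deterministic farthest-point start, and the mean silhouette score as the criterion for picking the number of clusters. The toy records and the helper names (`kmeans`, `silhouette`, `best_k`, `cluster_label`) are made up for illustration only.

```python
import math
from collections import Counter

# Hypothetical toy data: (set of labels, feature vector).
# Label A is deliberately spread over two separate regions.
records = [
    ({"A"}, (0.0, 0.0)), ({"A"}, (0.5, 0.2)), ({"A"}, (0.1, 0.6)),
    ({"A"}, (10.0, 10.0)), ({"A"}, (10.4, 9.7)), ({"A", "C"}, (9.8, 10.3)),
    ({"B"}, (0.0, 10.0)), ({"B"}, (0.3, 10.4)), ({"B"}, (0.6, 9.8)),
    ({"C"}, (10.0, 0.0)), ({"C"}, (10.3, 0.4)), ({"C"}, (9.7, 0.2)),
]
points = [features for _, features in records]


def kmeans(points, k, iters=50):
    """Plain k-means; returns one cluster index per point."""
    # Deterministic start: first point, then repeatedly the point
    # farthest from all centers chosen so far.
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    assign = None
    for _ in range(iters):
        new_assign = [min(range(k), key=lambda j: math.dist(p, centers[j])) for p in points]
        if new_assign == assign:  # converged
            break
        assign = new_assign
        for j in range(k):
            members = [p for p, a in zip(points, assign) if a == j]
            if members:  # keep the old center if a cluster went empty
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return assign


def silhouette(points, assign):
    """Mean silhouette score over all points (higher is better)."""
    clusters = set(assign)
    if len(clusters) < 2:
        return -1.0
    scores = []
    for i, p in enumerate(points):
        own = [math.dist(p, q) for j, q in enumerate(points)
               if assign[j] == assign[i] and j != i]
        if not own:  # singleton cluster: score 0 by convention
            scores.append(0.0)
            continue
        a = sum(own) / len(own)  # mean distance inside own cluster
        b = min(  # mean distance to the nearest other cluster
            sum(math.dist(p, q) for q, c in zip(points, assign) if c == other)
            / assign.count(other)
            for other in clusters - {assign[i]}
        )
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)


# Try several values of k and keep the one with the best silhouette score.
results = {k: kmeans(points, k) for k in range(2, 7)}
best_k = max(results, key=lambda k: silhouette(points, results[k]))
assign = results[best_k]

# Map every cluster to its most frequent label; a multilabel record
# votes once for each of its labels.
cluster_label = {}
for c in sorted(set(assign)):
    votes = Counter(label for (labels, _), a in zip(records, assign)
                    if a == c for label in labels)
    cluster_label[c] = votes.most_common(1)[0][0]

print(best_k, cluster_label)
```

On this toy data the sketch should settle on four clusters, two of which map to label A. That is the behaviour I am after, but I do not know whether silhouette (or any other internal criterion) is appropriate for real multilabel data, where the labels are ignored while choosing the number of clusters.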