I am trying to do anomaly detection on a heterogeneous dataset (there are unknown groups present in the data). I want to try an approach based on the multivariate Gaussian distribution, but I am unsure about the following:
Should I fit a single multivariate Gaussian distribution to the entire dataset, or should I cluster the dataset first and then fit a separate multivariate Gaussian distribution to each cluster? My intuition tells me to do the latter, but I am a bit hesitant to use K-Means clustering (my dataset has millions of records but few features, fewer than 100).
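
To make the two options concrete, here is a rough NumPy sketch of what I have in mind. The function names and the `labels` array (cluster assignments from whatever clustering step I end up using) are placeholders of mine, not an existing API. The ridge value `1e-6` and the toy data are arbitrary assumptions.

```python
import numpy as np

def fit_gaussian(X):
    """Estimate the mean and covariance of the rows of X (assumes >= 2 features)."""
    mean = X.mean(axis=0)
    # Small ridge keeps the covariance invertible (assumed value, not tuned).
    cov = np.cov(X, rowvar=False) + 1e-6 * np.eye(X.shape[1])
    return mean, cov

def log_density(X, mean, cov):
    """Log of the multivariate Gaussian density at each row of X."""
    d = X.shape[1]
    diff = X - mean
    # Squared Mahalanobis distance of each row from the mean.
    maha = np.einsum("ij,ij->i", diff @ np.linalg.inv(cov), diff)
    _, logdet = np.linalg.slogdet(cov)
    return -0.5 * (maha + logdet + d * np.log(2 * np.pi))

def score_single(X):
    """Option 1: one Gaussian for the whole dataset."""
    return log_density(X, *fit_gaussian(X))

def score_per_cluster(X, labels):
    """Option 2: one Gaussian per cluster; each point is scored by its own cluster."""
    scores = np.empty(len(X))
    for k in np.unique(labels):
        mask = labels == k
        scores[mask] = log_density(X[mask], *fit_gaussian(X[mask]))
    return scores

# Toy data: two well-separated groups plus one point halfway between them.
rng = np.random.default_rng(0)
blob_a = rng.normal(loc=-5.0, scale=1.0, size=(500, 2))
blob_b = rng.normal(loc=5.0, scale=1.0, size=(500, 2))
odd_point = np.array([[0.0, 0.0]])
X = np.vstack([blob_a, blob_b, odd_point])
labels = np.array([0] * 500 + [1] * 500 + [0])  # hypothetical cluster assignments

single = score_single(X)                    # low score = more anomalous
per_cluster = score_per_cluster(X, labels)
print("points scored lower than the odd point (single):", (single < single[-1]).sum())
print("index of lowest per-cluster score:", per_cluster.argmin())
```

In this toy setup the point at the origin sits near the global mean, so the single Gaussian scores it as typical, whereas the per-cluster fit gives it the lowest score. That is the behaviour behind my intuition.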
Would you kindly advise?