I need to run a two-step cluster analysis in SPSS. All four of my variables are continuous, standardized scale variables with a normal distribution. The dataset contains 10,000 cases.
SPSS suggests using Euclidean distance for such a dataset, but the results are not meaningful (2 clusters containing 99% and 1% of the cases), whereas with the log-likelihood distance option the clusters seem much more meaningful (both when I specify a fixed number of clusters and when I do not).
Question:
What could explain such meaningless results with Euclidean distance? Could it be noise handling? And is it incorrect to use the log-likelihood distance even though all my variables are continuous?
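
For context, here is a minimal sketch of how I understand the two distance options for all-continuous data. It assumes the log-likelihood distance between clusters j and s is ξ_j + ξ_s − ξ_<j,s>, with ξ_v = −N_v Σ_k ½ log(σ̂²_k + σ̂²_vk), and that the Euclidean option compares cluster centroids. The function names (`xi`, `loglik_distance`, `euclidean_distance`) and the toy data are hypothetical, and the use of population variance is my assumption; this is not SPSS's actual implementation.

```python
import math
from statistics import pvariance


def xi(cluster, global_var):
    """Assumed form: -N * sum_k 0.5 * log(global_var_k + cluster_var_k)."""
    n = len(cluster)
    total = 0.0
    for k, gvar in enumerate(global_var):
        column = [row[k] for row in cluster]
        # Population variance of variable k inside this cluster (assumption).
        total += 0.5 * math.log(gvar + pvariance(column))
    return -n * total


def loglik_distance(a, b, global_var):
    """Decrease in log-likelihood caused by merging clusters a and b."""
    return xi(a, global_var) + xi(b, global_var) - xi(a + b, global_var)


def euclidean_distance(a, b):
    """Euclidean distance between the centroids of clusters a and b."""
    centroid_a = [sum(col) / len(a) for col in zip(*a)]
    centroid_b = [sum(col) / len(b) for col in zip(*b)]
    return math.dist(centroid_a, centroid_b)


# Toy clusters with two variables; global variances are 1.0 because the
# variables are assumed to be standardized.
cluster_a = [[0.0, 0.0], [2.0, 0.0]]
cluster_b = [[3.0, 4.0], [5.0, 4.0]]
global_var = [1.0, 1.0]

d_euclid = euclidean_distance(cluster_a, cluster_b)
d_loglik = loglik_distance(cluster_a, cluster_b, global_var)
print(d_euclid, d_loglik)
```

In this sketch the log-likelihood distance depends on cluster sizes and within-cluster variances, while the Euclidean distance depends only on the centroids, which may be part of why the two options behave so differently on my data.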