I am analyzing rainfall water quality data, first with cluster analysis and then with principal component analysis (PCA). There are 7 variables. When I run PCA on the groups produced by the cluster analysis, almost all of them show a negative determinant and a very low KMO (less than 0.3), and according to Bartlett's test of sphericity the correlations are not significant (only 10% of the sample groups have a p-value < 0.05). I then checked the correlation matrices and found that some variables are very highly correlated and others only weakly. To raise the KMO and improve significance, I dropped the least correlated variable, but the results did not change much: the KMO became slightly higher, but the significance test still gave a p-value over 0.400.

  1. Can I use these data to get factors?
  2. What should I do with results like these?
Nick Cox
Lulu Marjan

1 Answer

I am not sure I understand precisely what you are trying to do (What's the role of cluster analysis? Are you trying to derive factors/latent variables or use PCA for data reduction?) but all this (determinant, KMO) does not sound promising and is certainly far outside of traditional guidelines from the 1950s. On the other hand, my understanding is that these diagnostics were mostly a way to guess if factor analysis would be worthwhile before actually trying it, at a time when it was tedious and expensive to do so.
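For reference, the three diagnostics mentioned in the question can all be computed from the correlation matrix. The following is only a minimal sketch, assuming complete data (no missing values) in a NumPy array; the function name `factorability_diagnostics` and the synthetic data are hypothetical placeholders, not the asker's rainfall data.

```python
import numpy as np
from scipy.stats import chi2

def factorability_diagnostics(X):
    """Determinant, overall KMO and Bartlett's test for an n x p data array.

    Hypothetical helper for illustration; assumes no missing values.
    """
    n, p = X.shape
    R = np.corrcoef(X, rowvar=False)      # p x p correlation matrix
    det = np.linalg.det(R)

    # Partial correlations, obtained from the inverse correlation matrix
    R_inv = np.linalg.inv(R)
    d = np.sqrt(np.diag(R_inv))
    partial = -R_inv / np.outer(d, d)

    # Overall KMO: squared correlations relative to squared correlations
    # plus squared partial correlations (off-diagonal entries only)
    off = ~np.eye(p, dtype=bool)
    r2 = np.sum(R[off] ** 2)
    q2 = np.sum(partial[off] ** 2)
    kmo = r2 / (r2 + q2)

    # Bartlett's test of sphericity: H0 is that R is the identity matrix
    stat = -(n - 1 - (2 * p + 5) / 6) * np.log(det)
    df = p * (p - 1) / 2
    p_value = chi2.sf(stat, df)
    return det, kmo, stat, p_value

# Synthetic, uncorrelated placeholder data: 50 samples, 7 variables
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 7))
det, kmo, stat, p_value = factorability_diagnostics(X)
print(f"determinant={det:.4f}  KMO={kmo:.3f}  "
      f"Bartlett chi2={stat:.2f}  p={p_value:.3f}")
```

A correlation matrix computed from complete data is positive semidefinite, so its determinant cannot be negative. A negative value may therefore point to pairwise deletion of missing values or to numerical problems rather than to a property of the variables themselves.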

With a modern computer, it's just a matter of pressing a few buttons and waiting a few seconds, so you might as well just run it and see what comes out. Chances are you will find many factors defined by single variables, many low loadings, or a generally uninterpretable structure, or you will perhaps not be able to factor the matrix at all. In that case you might as well work with the original variables.
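As an illustration of "just run it and see what comes out", here is a minimal sketch of a PCA on the correlation matrix using only NumPy. The function name `pca_from_correlation` and the synthetic data are assumptions for the example; any statistics package will give the same kind of output.

```python
import numpy as np

def pca_from_correlation(X):
    """PCA via eigendecomposition of the correlation matrix.

    Hypothetical helper for illustration; assumes no missing values.
    """
    R = np.corrcoef(X, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(R)   # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]      # reorder to descending
    eigvals = eigvals[order]
    eigvecs = eigvecs[:, order]

    # Loadings: correlations between variables and components
    loadings = eigvecs * np.sqrt(np.clip(eigvals, 0, None))
    explained = eigvals / eigvals.sum()
    return eigvals, loadings, explained

# Synthetic, uncorrelated placeholder data: 50 samples, 7 variables
rng = np.random.default_rng(1)
X = rng.normal(size=(50, 7))
eigvals, loadings, explained = pca_from_correlation(X)

np.set_printoptions(precision=2, suppress=True)
print("eigenvalues:       ", eigvals)
print("variance explained:", explained)
print("loadings on the first two components:")
print(loadings[:, :2])
```

With nearly uncorrelated variables like these, the eigenvalues tend to stay close to 1 and no component stands out, which is the kind of uninterpretable result described above.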

Gala