I am doing a study where I first need to run an exploratory analysis that groups diseases that coexist in my study population. For this I have x diseases/variables that are truly dichotomous (presence or absence of the diagnosis).
The purpose of this first exploratory analysis is to then study all the possible combinations within each group (and so reduce the initial 2^x possible combinations), based on the real coexistence of the diseases in the population. The aim is not to reduce dimensions or to create a score.
For this I thought of three methods:
- PCA with a phi correlation matrix (given that the variables are truly dichotomous);
- PCA with a tetrachoric correlation matrix (the variables are truly dichotomous rather than artificially dichotomized, but I tried this method after reading some suggestions here);
- LTM (not tried yet, because I have never used it; I thought of it after reading some suggestions, although I am not sure it is the right approach).
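To make the first option concrete, here is a minimal Python sketch, assuming the diagnoses are coded 0/1 in a NumPy array. The data are simulated and all names are illustrative, not from my study. For 0/1 variables the phi coefficient is identical to the Pearson correlation, so `np.corrcoef` gives the phi matrix directly; the helper only checks this against the 2x2 table formula.

```python
import numpy as np

def phi_from_table(a, b):
    """Phi coefficient computed from the 2x2 table of two 0/1 vectors."""
    n11 = float(np.sum((a == 1) & (b == 1)))
    n10 = float(np.sum((a == 1) & (b == 0)))
    n01 = float(np.sum((a == 0) & (b == 1)))
    n00 = float(np.sum((a == 0) & (b == 0)))
    denom = np.sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    return (n11 * n00 - n10 * n01) / denom

# Simulated stand-in data: 500 patients, 4 independent binary diagnoses.
rng = np.random.default_rng(0)
X = (rng.random((500, 4)) < 0.3).astype(int)

# Pearson correlation on 0/1 columns is the phi correlation matrix.
phi_matrix = np.corrcoef(X, rowvar=False)

# Cross-check one entry against the contingency-table formula.
phi_01 = phi_from_table(X[:, 0], X[:, 1])
```

The tetrachoric matrix has no such shortcut, because it estimates the correlation of assumed underlying continuous variables and needs a dedicated routine.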
The grouping that made the most (biological) sense came from PCA with phi correlation (in this initial test x = 12, and a variable was assigned to a component when its loading exceeded 0.40 after varimax rotation). PCA with tetrachoric correlation made less sense under the same criteria: the solution that made the most sense had too few principal components, with many variables per component and therefore many possible combinations within each component, which does not help much in my next analysis.
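For readers who want to reproduce the procedure, this is a rough sketch of what I did, written in Python with NumPy only. It rests on several assumptions that are not part of my actual analysis: the data are simulated (two blocks of three related diagnoses), components are retained by the eigenvalue > 1 rule, and `varimax` is a hand-written helper, not a library function. The last lines compare the full number of combinations with the number left when combinations are enumerated only within each group.

```python
import numpy as np

def varimax(loadings, max_iter=100, tol=1e-6):
    """Orthogonal varimax rotation of a (variables x components) loading matrix."""
    p, k = loadings.shape
    rotation = np.eye(k)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        target = rotated**3 - rotated @ np.diag(np.sum(rotated**2, axis=0)) / p
        u, s, vt = np.linalg.svd(loadings.T @ target)
        rotation = u @ vt
        new_criterion = s.sum()
        if criterion != 0 and new_criterion / criterion < 1 + tol:
            break
        criterion = new_criterion
    return loadings @ rotation

# Simulated data: 2 latent conditions, each driving 3 binary diagnoses,
# with 10% of the entries flipped at random.
rng = np.random.default_rng(42)
n = 2000
latent = np.repeat(rng.random((n, 2)) < 0.4, 3, axis=1)
flip = rng.random((n, 6)) < 0.1
X = (latent ^ flip).astype(int)

# PCA on the phi (Pearson on 0/1) correlation matrix.
phi = np.corrcoef(X, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(phi)
order = np.argsort(eigvals)[::-1]            # sort components by variance
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

k = int(np.sum(eigvals > 1.0))               # assumed retention rule
loadings = eigvecs[:, :k] * np.sqrt(eigvals[:k])
rotated = varimax(loadings)

# Assign a variable to a component when |loading| > 0.40.
groups = sorted(
    np.where(np.abs(rotated[:, j]) > 0.40)[0].tolist() for j in range(k)
)

all_combinations = 2 ** X.shape[1]
within_group_combinations = sum(2 ** len(g) for g in groups)
```

With a clean block structure like this simulated one, the rotated loadings separate the variables into disjoint groups. On real data a variable can exceed the threshold on several components or on none, so the groups may overlap or leave variables unassigned.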
Given my goals for this exploratory analysis, what is my best approach (one of those mentioned, or another)?
Note: PCA: Principal Component Analysis; LTM: Latent trait model.