
Suppose you want to find clusters based on a set of variables $Y$, and that you want to estimate the effects of some variables $X$ on membership in those clusters. Here is how I am doing it now.

Step 1: Perform model-based clustering on the variables $Y$ (I am using the mclust package for this).

Step 2: Fit a multinomial regression model with cluster membership as the outcome variable and $X$ as the predictors.
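The two steps above can be sketched roughly as follows. This is only an illustration under assumptions: the data are simulated, the names `x`, `Y`, `fit_clust`, and `fit_reg` are made up for the example, and `multinom()` from the nnet package is one possible choice for Step 2 (the question does not say which function is used).

```r
library(mclust)  # Mclust() for model-based clustering
library(nnet)    # multinom() for multinomial regression (assumed choice)

set.seed(1)
n <- 300
x <- rnorm(n)                        # hypothetical covariate X
z <- rbinom(n, 1, plogis(1.5 * x))   # simulated latent class, depends on x
# Hypothetical indicators Y: two continuous variables whose means differ by class
Y <- cbind(y1 = rnorm(n, mean = 3 * z),
           y2 = rnorm(n, mean = -3 * z))

# Step 1: cluster on Y alone (two mixture components)
fit_clust <- Mclust(Y, G = 2)
cluster <- factor(fit_clust$classification)

# Step 2: regress the hard cluster assignment on X
fit_reg <- multinom(cluster ~ x, trace = FALSE)
summary(fit_reg)
```

Note that Step 2 treats the hard assignments as if they were observed, so the classification uncertainty from Step 1 (available in `fit_clust$uncertainty`) is ignored.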

It seems there must be a better approach in which the two models are estimated simultaneously. Does anyone know a good tool in R for this and, even better, a good set of references for (a) the statistical model that the package implements and (b) how to use the package?

Thanks

Brash Equilibrium

1 Answer


Uncharacteristically, I am answering my own question! I found an R package called poLCA, which fits latent class models with polytomous (categorical/factor) "manifest" variables, the outcome variables used to define the latent classes, and "concomitant" variables, which affect the probability of class membership. Both parts are estimated together in a single model. The package was developed by Linzer and Lewis (2011). Oh, and to save you the web search: yes, it accepts ordinal manifest variables, although it treats their categories as unordered. Happy birthday to me.
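For anyone who wants to try it, here is a minimal sketch of a latent class regression in poLCA. It uses the `election` example data that ships with the package, with `PARTY` as the concomitant variable. The choice of three classes and five random restarts is just for illustration, not a recommendation.

```r
library(poLCA)

data(election)  # example survey data shipped with poLCA

# Manifest variables go on the left-hand side inside cbind();
# concomitant variables (covariates on class membership) go on the right.
f <- cbind(MORALG, CARESG, KNOWG, LEADG, DISHONG, INTELG,
           MORALB, CARESB, KNOWB, LEADB, DISHONB, INTELB) ~ PARTY

set.seed(1)
fit <- poLCA(f, data = election, nclass = 3, nrep = 5, verbose = FALSE)

fit$coeff            # multinomial logit coefficients for class membership
fit$coeff.se         # their standard errors
head(fit$posterior)  # posterior class membership probabilities per respondent
```

With `~ 1` on the right-hand side instead of `~ PARTY`, the same call fits a basic latent class model with no covariates.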

References

Linzer, DA, & Lewis, JB. (2011). poLCA: An R package for polytomous variable latent class analysis. Journal of Statistical Software 42(10):1-29.

Brash Equilibrium