I am hoping to implement an unsupervised technique that identifies distinct clusters of individuals based on longitudinal data: 100 continuous or categorical variables measured at different ages.

Most of the functionality provided by R packages seems to have been developed for simpler cases (e.g. a single variable measured at several time points). What is the best way to approach this problem in R, and which techniques (e.g. latent class modelling) are considered to perform best?

Guest333
  • I have used Latent Transition Analysis with discretized continuous variables and other categorical variables. Not sure if R supports it, but you can do it in SAS. As for the unsupervised part, I assume you want to use this for data mining or something similar? – ReliableResearch May 13 '13 at 19:35
  • @toomuchpj The [poLCA](http://dlinzer.github.io/poLCA/) R package offers some facilities for Latent Class Analysis as well. – chl May 13 '13 at 21:16
  • Yes, I do want to use it for data mining and for identifying distinct subgroups (e.g. individuals who start with high values of x1, ..., x50 at early ages and then show increases in x51, ..., x100, etc.). I will have a look at poLCA. Are there any other techniques that could be relevant? – Guest333 May 14 '13 at 09:59
  • Possible duplicate: https://stats.stackexchange.com/questions/13442/how-to-find-groupings-trajectories-among-longitudinal-data – radek Jan 24 '19 at 02:00

1 Answer

If there is one particular longitudinal variable you are interested in, you could take an unsupervised approach on the covariates, using either a mixed-effects regression tree or a latent growth curve structural equation model (SEM) tree. For more on SEM trees, see the semtree user guide: http://brandmaier.de/semtree/user-guide/

dmartin
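
To make the goal in the question concrete (grouping individuals by how many variables evolve with age), here is a minimal Python sketch of a simple two-stage baseline. It is not the latent class, latent transition, or SEM tree approach mentioned in this thread: each continuous trajectory is summarized by a least-squares intercept and slope against age, and the resulting per-individual feature vectors are grouped with plain k-means. The toy data, the function names (`ols_line`, `trajectory_features`, `kmeans`), and the farthest-point initialization are all assumptions made for illustration. Categorical variables are not handled, and in practice the features would usually be standardized first.

```python
from statistics import mean

def ols_line(ages, values):
    """Return (intercept, slope) of a least-squares line through one trajectory."""
    a_bar, v_bar = mean(ages), mean(values)
    sxx = sum((a - a_bar) ** 2 for a in ages)
    sxy = sum((a - a_bar) * (v - v_bar) for a, v in zip(ages, values))
    slope = sxy / sxx if sxx else 0.0
    return v_bar - slope * a_bar, slope

def trajectory_features(person):
    """person: {variable_name: [(age, value), ...]} -> flat list of
    (intercept, slope) pairs, one pair per variable, in sorted name order."""
    feats = []
    for name in sorted(person):
        ages, values = zip(*person[name])
        feats.extend(ols_line(ages, values))
    return feats

def sq_dist(u, v):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(u, v))

def kmeans(rows, k, iters=50):
    """Plain k-means with deterministic farthest-point initialization
    (an illustrative choice, not a recommendation)."""
    centers = [list(rows[0])]
    while len(centers) < k:
        farthest = max(rows, key=lambda r: min(sq_dist(r, c) for c in centers))
        centers.append(list(farthest))
    labels = [0] * len(rows)
    for _ in range(iters):
        # Assignment step: each row goes to its nearest center.
        labels = [min(range(k), key=lambda c: sq_dist(r, centers[c])) for r in rows]
        # Update step: each center becomes the mean of its members.
        new_centers = []
        for c in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == c]
            if members:
                new_centers.append([mean(col) for col in zip(*members)])
            else:
                new_centers.append(centers[c])  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    return labels

# Hypothetical toy data: two variables observed at ages 1..4.
ages = [1, 2, 3, 4]
people = []
for i in range(3):  # group A: x1 increases with age, x2 stays flat
    people.append({"x1": [(a, 2 * a + 0.1 * i) for a in ages],
                   "x2": [(a, 1.0 + 0.1 * i) for a in ages]})
for i in range(3):  # group B: x1 stays flat, x2 increases with age
    people.append({"x1": [(a, 1.0 + 0.1 * i) for a in ages],
                   "x2": [(a, 3 * a + 0.1 * i) for a in ages]})

features = [trajectory_features(p) for p in people]
labels = kmeans(features, k=2)
print(labels)
```

With 100 variables the same idea yields 200 features per individual, so some dimension reduction or a model-based method is likely to be needed; this sketch only illustrates the data layout and the general two-stage idea.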