Could decision tree classification be used to identify latent classes in data?

Question

What are the differences in statistical assumptions between decision tree classification and latent class analysis (LCA) classification? Could a decision tree be used to do the same work as LCA, and if not, why not?

It seems to me that a decision tree (DT) is much more explainable, and that what a class consists of is much more intuitive. However, within my field of social science, LCA seems to be much more prevalent as a method. I'm assuming this means that a decision tree cannot replace LCA, but I am unsure why that is.

Tim · Accepted Answer · 2021-06-27T14:41:09.710


Latent class analysis is a clustering algorithm. Its main purpose is to find clusters in the data (latent classes). A decision tree is a classification algorithm. It doesn’t assume that the data is clustered, but it implicitly assumes that the data come from a homogeneous distribution.
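For reference, the usual textbook form of the latent class model (a standard formulation, not something specific to this thread) treats each observation as a draw from a finite mixture in which the observed items are independent given the class. The symbols $K$ (number of classes), $J$ (number of items), $\pi_k$ (class proportions) and $c_i$ (the unobserved class of observation $i$) are notation introduced here for illustration:

$$
P(\mathbf{x}_i) = \sum_{k=1}^{K} \pi_k \prod_{j=1}^{J} P(x_{ij} \mid c_i = k), \qquad \sum_{k=1}^{K} \pi_k = 1
$$

A classification tree has no latent $c_i$ at all: it models $P(y \mid \mathbf{x})$ for an observed label $y$ by recursively partitioning the predictors.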

In classification you know the labels and want the algorithm to learn how to recreate them. In clustering you don’t know the labels; the algorithm finds a useful grouping for the data. I have never heard of “LCA classification”. I guess someone could use the name for latent class regression built on logistic regression, but even in that case it still does the clustering regardless of the labels.
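To make the labels-versus-no-labels distinction concrete, here is a minimal sketch of a latent class model fitted by EM on simulated yes/no items. It is an illustration under stated assumptions, not a reference implementation: `fit_lca` and `simulate` are hypothetical helper names, the data and its 0.9/0.1 response probabilities are made up, and a real analysis would use a dedicated LCA package.

```python
import random


def fit_lca(X, n_classes, n_iter=200, seed=0):
    """EM for a latent class model with binary items (hypothetical helper).

    X is a list of rows of 0/1 responses. No labels are passed in.
    Returns (class weights, P(item = 1 | class), posterior memberships).
    """
    rng = random.Random(seed)
    n, m = len(X), len(X[0])
    weights = [1.0 / n_classes] * n_classes
    # Random start for P(item j = 1 | class k), kept away from 0 and 1.
    probs = [[rng.uniform(0.25, 0.75) for _ in range(m)] for _ in range(n_classes)]
    resp = []
    for _ in range(n_iter):
        # E-step: posterior probability of each class for every row.
        resp = []
        for row in X:
            joint = []
            for k in range(n_classes):
                like = weights[k]
                for j, x in enumerate(row):
                    like *= probs[k][j] if x else 1.0 - probs[k][j]
                joint.append(like)
            total = sum(joint)
            resp.append([v / total for v in joint])
        # M-step: re-estimate class weights and item probabilities.
        for k in range(n_classes):
            nk = max(sum(r[k] for r in resp), 1e-12)  # guard against an empty class
            weights[k] = nk / n
            for j in range(m):
                p = sum(r[k] * row[j] for r, row in zip(resp, X)) / nk
                probs[k][j] = min(max(p, 1e-6), 1.0 - 1e-6)
    return weights, probs, resp


def simulate(n_per_group=150, seed=1):
    """Made-up data: two hidden groups with opposite patterns on six items."""
    rng = random.Random(seed)
    X, truth = [], []
    patterns = [[0.9] * 3 + [0.1] * 3, [0.1] * 3 + [0.9] * 3]
    for g, p_items in enumerate(patterns):
        for _ in range(n_per_group):
            X.append([1 if rng.random() < p else 0 for p in p_items])
            truth.append(g)
    return X, truth


X, truth = simulate()
weights, probs, resp = fit_lca(X, n_classes=2)  # `truth` is never shown to the model
assigned = [max(range(2), key=lambda k: r[k]) for r in resp]
# Class numbering is arbitrary, so score agreement up to a label swap.
match = sum(a == t for a, t in zip(assigned, truth)) / len(truth)
agreement = max(match, 1.0 - match)
print("estimated class sizes:", [round(w, 2) for w in weights])
print("agreement with the hidden groups:", round(agreement, 2))
```

Note that `fit_lca` only ever sees `X`. A decision tree, by contrast, cannot be fitted at all without a label column to predict, which is the distinction drawn above.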

A decision tree may be more interpretable if you want to explain the classification decision. On the other hand, if you need to explain what the clusters in the data are, a decision tree does nothing to solve this problem.

edited Jun 27 '21 at 14:41

answered Jun 27 '21 at 12:36


Thank you for a well-put answer! LCA's ability to "sort out" different distributions seems like a notable difference. The difference between starting out with a known target or not also seems to divide the two. By LCA classification I was merely referring to the 'latent classes' that LCA identifies in data. However, aren't DTs able to identify clusters? It seems to me that this is at times what they are used for. – fmfrisch Jun 27 '21 at 15:31

@fmfrisch No, a decision tree is a classification algorithm; it doesn’t do anything else. You might have confused it with hierarchical clustering, but that is a completely different algorithm: https://en.wikipedia.org/wiki/Hierarchical_clustering – Tim Jun 27 '21 at 15:59
