exploring where and how clusters differ

Question

After running clustering, I'd like to explore the results and hopefully get a concrete idea of where (i.e. on which of my n variables) differences arise between clusters.

What if I run n one-way ANOVAs with cluster as my IV and each of the n variables as my DV? I am aware that this is double dipping (the clustering has already separated the most distinct data points, so there will definitely be differences) and that performing n tests would inflate the Type I error rate. Yet I am only doing this as a post-hoc exploration, so I was wondering whether it would be valid simply as an indication of where those differences are.

If not then what alternative might there be to learn more about the clustering output?

That's a great question. If you're worried about the validity of p-values, it may be useful to just graph the data to see where the differences are. There is often no need to test a hypothesis if you don't have one. Exploratory data analysis (including visualization) is totally valid if you're doing just that: exploring the data! — chelseaparlett, Jul 10 '20 at 17:45
Well I think this is what I'm a bit stuck with, because I have so many variables and I feel like I need some systematic way to identify where those differences are. I was thinking of using the hypothesis testing to guide me to where they are. I wasn't sure if this is a valid approach though — fffrost, Jul 10 '20 at 19:47
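One systematic, p-value-free way to scan many variables at once is a table of standardized cluster means. Below is a rough sketch of that idea, not something proposed in the thread: the helper `cluster_profile`, the column names, and the synthetic data are all made up for illustration, and it assumes numeric variables in a pandas DataFrame with one cluster label per row.

```python
import numpy as np
import pandas as pd


def cluster_profile(X: pd.DataFrame, labels) -> pd.DataFrame:
    """Mean z-score of every variable within every cluster (hypothetical helper)."""
    # Standardize each variable so that cluster means are comparable across variables.
    z = (X - X.mean()) / X.std(ddof=0)
    # Rows = clusters, columns = variables.
    return z.groupby(np.asarray(labels)).mean()


# Synthetic illustration: var_a separates cluster 2, var_c separates cluster 0,
# and var_b carries no cluster structure at all.
rng = np.random.default_rng(0)
labels = np.repeat([0, 1, 2], 50)
X = pd.DataFrame({
    "var_a": rng.normal(0, 1, 150) + 3 * (labels == 2),
    "var_b": rng.normal(0, 1, 150),
    "var_c": rng.normal(0, 1, 150) - 2 * (labels == 0),
})

profile = cluster_profile(X, labels)
print(profile.round(2))
```

Entries far from zero mark the variables on which a cluster sits well above or below the overall mean. The same table can be passed to a heatmap if a picture is easier to read than numbers.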
This question has been asked many times before. Since clusters are groups, any comparison procedure appropriate for the data can be used to assess the differences between clusters on a particular variable. You don't need p-values but rather some measure of effect size (the size of the difference). One of the popular [internal clustering criteria](https://stats.stackexchange.com/a/358937/3277), Ratkowsky–Lance, is easily computed for each variable (univariately) in addition to its omnibus version, and these partial values show which of the variables contribute more and which less. — ttnphns, Jul 10 '20 at 19:47
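As a minimal sketch of the effect-size idea, the snippet below ranks variables by eta squared, i.e. the between-cluster sum of squares divided by the total sum of squares, computed one variable at a time. This is a swapped-in, generic effect size rather than code from the linked answer; as far as I understand, the Ratkowsky-Lance criterion is built from the same per-variable ratio (via its square root), but treat that link as an assumption. The function name `eta_squared_by_variable` and the synthetic data are hypothetical.

```python
import numpy as np
import pandas as pd


def eta_squared_by_variable(X: pd.DataFrame, labels) -> pd.Series:
    """Per-variable BSS / TSS, sorted from most to least separating (hypothetical helper)."""
    labels = np.asarray(labels)
    grand_mean = X.mean()
    # Total sum of squares per variable.
    tss = ((X - grand_mean) ** 2).sum()
    groups = X.groupby(labels)
    sizes = groups.size()
    means = groups.mean()
    # Between-cluster sum of squares: size-weighted squared deviation of cluster means.
    bss = ((means - grand_mean) ** 2).mul(sizes, axis=0).sum()
    return (bss / tss).sort_values(ascending=False)


# Synthetic illustration: x1 is strongly cluster-dependent, x3 moderately, x2 not at all.
rng = np.random.default_rng(1)
labels = np.repeat([0, 1, 2], 50)
X = pd.DataFrame({
    "x1": rng.normal(0, 1, 150) + 3 * (labels == 2),
    "x2": rng.normal(0, 1, 150),
    "x3": rng.normal(0, 1, 150) - 2 * (labels == 0),
})

eta = eta_squared_by_variable(X, labels)
print(eta.round(3))
```

Values lie between 0 and 1. Because the clusters were formed from these same variables, the numbers describe the partition rather than test anything, which is the point of using an effect size instead of a p-value here.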

