I have a data set of about 300 dummy variables, each indicating whether a particular feature is present, where each row represents an individual.
I ran nMDS on this data and plotted the result to visualize which individuals are more similar to one another, and this seems to work as I expect.
However, I now want to explain quantitatively why the space looks the way it does, i.e. why the points aren't randomly distributed across it. As I understand from this post, I shouldn't try to model the nMDS axes themselves (although I could possibly use the dimensions to explain a response?).
I thought that perhaps I could perform a cluster analysis on the nMDS output and then model the resulting cluster labels with some modelling technique (random forest, categorical linear models or whatever). Does this seem appropriate? Are there other methods that might be more appropriate for this?
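To make the idea concrete, here is a minimal sketch of what I have in mind. It is written in Python with scikit-learn purely for illustration. The helper name `explain_clusters`, the choice of k-means, the number of clusters, and the toy data are all my own assumptions; the toy `coords` array merely stands in for real nMDS coordinates.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import RandomForestClassifier


def explain_clusters(coords, dummies, n_clusters=3, seed=0):
    """Hypothetical helper: cluster ordination coordinates, then rank the
    dummy variables by how well they predict cluster membership."""
    # Step 1: group individuals by their position in the ordination space.
    kmeans = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    labels = kmeans.fit_predict(coords)

    # Step 2: model the cluster labels from the original dummy variables.
    forest = RandomForestClassifier(n_estimators=200, random_state=seed)
    forest.fit(dummies, labels)

    # Step 3: sort variables by impurity-based importance, largest first.
    importances = forest.feature_importances_
    order = np.argsort(importances)[::-1]
    return labels, order, importances


# Toy data (assumption): 90 individuals in three groups, 20 dummy variables.
rng = np.random.default_rng(0)
n = 90
group = np.repeat([0, 1, 2], n // 3)

# Stand-in for nMDS coordinates: three well-separated blobs in 2D.
centers = np.array([[0.0, 0.0], [5.0, 5.0], [-5.0, 5.0]])
coords = centers[group] + rng.normal(scale=0.3, size=(n, 2))

# Dummy variables 0-2 track group membership; the rest are random noise.
dummies = rng.integers(0, 2, size=(n, 20))
for g in range(3):
    dummies[:, g] = (group == g).astype(int)

labels, order, importances = explain_clusters(coords, dummies)
print("Most informative dummy variables:", order[:3])
```

With real data, `coords` would be the nMDS configuration and `dummies` the original 0/1 matrix. The ranking from the forest is only one possible way to summarise which features separate the clusters.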