We have a dataset of subgroups of the bacterium E. coli. We have frequency data for four subgroups at 38 locations, and we want to cluster the locations according to the frequencies of the subgroups that occur at each.
Initially we used two clustering methods:
Euclidean distance, using the dist {stats} function in R; and
clustering with FactoMineR, using metric="euclidean" and method="ward".
Following this we received feedback that "median clustering would be better due to non-parametric aspects of data". Does this make sense?
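
To make the comparison concrete, here is a minimal base-R sketch of what we think is being contrasted. Several things in it are assumptions rather than our actual analysis: the data are simulated stand-ins (the object names `counts` and `freq` are hypothetical), the Ward step uses `hclust` from stats rather than FactoMineR, the number of clusters `k` is arbitrary, and we are guessing that "median clustering" means median linkage (`method = "median"` in `hclust`) rather than something like k-medians.

```r
# Simulated stand-in for the real data: 38 locations x 4 subgroups.
set.seed(1)
counts <- matrix(rpois(38 * 4, lambda = 20), nrow = 38,
                 dimnames = list(paste0("loc", 1:38),
                                 paste0("subgroup", 1:4)))

# Convert counts to within-location relative frequencies (rows sum to 1).
freq <- counts / rowSums(counts)

# Euclidean distances between locations.
d <- dist(freq, method = "euclidean")

# Ward's criterion on the (unsquared) Euclidean distances.
hc_ward <- hclust(d, method = "ward.D2")

# Median linkage (assumed meaning of "median clustering");
# the hclust help page notes it is meant for squared Euclidean distances.
hc_median <- hclust(d^2, method = "median")

# Cut both trees into the same (arbitrary) number of clusters and compare.
k <- 3
cl_ward   <- cutree(hc_ward, k = k)
cl_median <- cutree(hc_median, k = k)
table(ward = cl_ward, median = cl_median)
```

The cross-tabulation at the end shows how far the two partitions agree for a given `k`; on real data the dendrograms themselves (`plot(hc_ward)`, `plot(hc_median)`) would also be worth inspecting.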