Background
I have survey data on political views (from CSES) in which respondents answer on rating scales, either 0 to 10 (0, 1, 2, ..., 10) or 0 to 3 (0, 1, 2, 3). I want to analyze this data using hierarchical cluster analysis.
To do this, I need to scale the features so that they contribute equally to the distance measure. For now, I do not want to weight any feature more heavily than the others.
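To make the scaling question concrete, here is a minimal sketch of the two options I am weighing: standardization (z-scores) and rescaling by the known endpoints of each scale. The responses are made up for illustration and are not real CSES data, and the names `standardize` and `range_scale` are my own.

```python
from statistics import mean, pstdev

# Made-up responses for illustration only (not real CSES data).
like_party_x = [0, 3, 5, 8, 10]  # item on the 0-10 scale
feel_close = [0, 1, 1, 2, 3]     # item on the 0-3 scale


def standardize(values):
    """Z-score: subtract the mean, divide by the population standard deviation."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


def range_scale(values, lo, hi):
    """Map a scale with known endpoints lo..hi onto the interval 0..1."""
    return [(v - lo) / (hi - lo) for v in values]


z_x = standardize(like_party_x)
z_close = standardize(feel_close)
r_x = range_scale(like_party_x, 0, 10)
r_close = range_scale(feel_close, 0, 3)

print(z_x, z_close)
print(r_x, r_close)
```

As far as I understand, standardization depends on how the answers happen to be distributed in the sample, whereas range scaling depends only on the endpoints of the question, so the two can give different relative weights to the same item.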
As a similarity measure, I am planning to use Euclidean distance, but I am not sure it is appropriate for ordinal features like these. I know that similarity can also be measured with correlation-based measures, but that does not seem appropriate in this case either.
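A small sketch of why I think the unscaled Euclidean distance is a problem here. The three respondents are hypothetical, and dividing by the scale maximum is just one possible rescaling, used only for illustration.

```python
from math import dist

# Hypothetical respondents: (like party X on 0-10, feel close to a party on 0-3).
a = (0, 0)
b = (10, 0)  # differs from a by the full range of the 0-10 item
c = (0, 3)   # differs from a by the full range of the 0-3 item


def rescale(point):
    """Divide each answer by the maximum of its scale (10 and 3 respectively)."""
    return (point[0] / 10, point[1] / 3)


# Raw distances: the 0-10 item dominates purely because of its wider range.
raw_ab = dist(a, b)
raw_ac = dist(a, c)

# After rescaling, a full-range difference counts the same on both items.
scaled_ab = dist(rescale(a), rescale(b))
scaled_ac = dist(rescale(a), rescale(c))

print(raw_ab, raw_ac)        # 10.0 3.0
print(scaled_ab, scaled_ac)  # 1.0 1.0
```

Even after rescaling, the distance still treats the gap between adjacent answer categories as equal everywhere on the scale, which is the part I am unsure about for ordinal data.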
Questions
- Is standardization (scaling to a mean of 0 and a standard deviation of 1) appropriate for features on these scales, or should I use some other scaling method?
- Are there any problems with using Euclidean distance as a similarity measure that I should be aware of in this case?
Additional info
Some examples of the survey questions (the exact wording was different; I do not have access to it at the moment):
- On a scale 0 to 10 do you dislike or like party X?
- On a scale 0 to 10 do you dislike or like party Y?
- On a scale 0 to 10 where would you place yourself on the left-right scale?
- On a scale 0 to 3 do you feel close to a particular party?
Thanks! Have a nice day.