I am trying to cluster data that contains both categorical and continuous variables (I plan to use K-means in R). For example, my data has 4 variables: gender (M and F), income (15,000 to 70,000 USD), employment period (in months), and education (Bachelor, Master, and PhD).
First, I recode each categorical variable as a set of 0/1 flags, so gender is represented by [1, 0] for male and [0, 1] for female. The same applies to education: Bachelor becomes [1, 0, 0], Master [0, 1, 0], and PhD [0, 0, 1].
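To make the recoding concrete, here is a minimal sketch of what I mean. It is written in plain Python only for illustration (the same steps apply in R). The records, the level orderings, and the helper name `one_hot` are all made up for this example and are not from my real data.

```python
# Hypothetical example records; the values are invented for illustration.
records = [
    {"gender": "M", "education": "Bachelor"},
    {"gender": "F", "education": "PhD"},
    {"gender": "F", "education": "Master"},
]

# Assumed level orderings; the position of the 1 follows these lists.
GENDER_LEVELS = ["M", "F"]
EDUCATION_LEVELS = ["Bachelor", "Master", "PhD"]


def one_hot(value, levels):
    """Return a list of 0/1 flags with a single 1 at the position of `value`."""
    if value not in levels:
        raise ValueError(f"unknown level: {value!r}")
    return [1 if value == level else 0 for level in levels]


# Each row becomes the gender flags followed by the education flags.
encoded = [
    one_hot(r["gender"], GENDER_LEVELS) + one_hot(r["education"], EDUCATION_LEVELS)
    for r in records
]

for row in encoded:
    print(row)
```

With this layout each record contributes two gender flags and three education flags, so the categorical part of every row is a vector of five 0/1 values.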
Here is my question:
Since income ranges from 15,000 to 70,000 USD while employment period is measured in months, the two variables are on very different scales, so I assume I should normalize both of them.
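As a rough sketch of the kind of normalization I have in mind, the snippet below rescales each continuous variable to the range [0, 1] (min-max scaling). This is only one possible choice, and standardizing to zero mean and unit variance would be another. It is written in plain Python for illustration; the sample values and the helper name `min_max_scale` are assumptions made up for this example.

```python
def min_max_scale(values):
    """Linearly rescale a list of numbers so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no information; map it to all zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical sample values, invented for illustration.
incomes = [15000, 42500, 70000]   # USD
employment_months = [6, 30, 120]  # months

scaled_income = min_max_scale(incomes)
scaled_employment = min_max_scale(employment_months)

print(scaled_income)
print(scaled_employment)
```

After this step both continuous variables lie in [0, 1], the same range as the 0/1 flags.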
What about the 0/1 flag variables that I created? Do I have to normalize them as well?