I have only categorical variables in my database.
Which distance/similarity measure should I use?
I'm using the function simil() from the proxy package (library(proxy)) in R.
You could try converting your categorical variables into sets of dummy variables and then using the Jaccard index as the similarity measure (or 1 minus the Jaccard index if you need a distance).
There is a more detailed explanation here: What is the optimal distance function for individuals when attributes are nominal?
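As a rough sketch of the dummy-variable approach with simil(), something like the following should work. The data frame and its column names are made up for illustration, and the contrasts.arg trick is just one way to get an indicator column for every level; treat it as a starting point rather than the definitive way to do it.

```r
library(proxy)

# Hypothetical toy data: three rows, two nominal variables.
df <- data.frame(
  colour = factor(c("red", "red", "blue")),
  shape  = factor(c("round", "square", "square"))
)

# Build one indicator column per level (no reference level dropped),
# then remove the intercept column that model.matrix() adds.
full_contrasts <- lapply(df, contrasts, contrasts = FALSE)
dummies <- model.matrix(~ ., data = df, contrasts.arg = full_contrasts)[, -1]

# Convert the 0/1 indicators to TRUE/FALSE for the binary measure.
dummies <- dummies == 1

# Jaccard similarity between rows, and the matching dissimilarity.
sim <- simil(dummies, method = "Jaccard")
d   <- proxy::dist(dummies, method = "Jaccard")
```

With this encoding, two rows that agree on m of p variables get a Jaccard similarity of m / (2p - m), so rows 1 and 2 above (same colour, different shape) come out at 1/3.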