How do I define a distance function when Euclidean distance doesn't apply? For instance, say I have some data that involves nationality. I could assign a number to each nation, but a smaller difference between two nations' numbers doesn't mean those nations are more likely to belong in the same cluster than nations whose numbers differ more.
Does it make sense to define a function that returns 0 if two nations are the same and some positive integer otherwise? If so, how large should that positive integer be?
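To make the idea concrete, here is a minimal sketch in Python of the function I have in mind. The names `nationality_distance` and `mismatch_cost` are placeholders I made up, and the default value of 1 is an arbitrary assumption, since choosing that value is exactly what I'm unsure about.

```python
def nationality_distance(a: str, b: str, mismatch_cost: float = 1.0) -> float:
    """Return 0 when the two nationalities are identical, mismatch_cost otherwise.

    mismatch_cost is a hypothetical parameter: its default of 1.0 is an
    arbitrary placeholder, not a recommended value.
    """
    return 0.0 if a == b else float(mismatch_cost)


# Identical nations are at distance 0; any two different nations are
# equally far apart, regardless of which numeric codes they might get.
same = nationality_distance("France", "France")
different = nationality_distance("France", "Japan")
scaled = nationality_distance("France", "Japan", mismatch_cost=5)
```

With this definition every pair of distinct nations is equally far apart, so only the choice of `mismatch_cost` relative to the other features would affect the clustering.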