
In which cases should we pick the Manhattan distance, and when should we use the Euclidean distance measure?

To my understanding, both are used for continuous numeric data (unlike cosine similarity and other measures that capture different notions of similarity).

So for which kind of data does which method work well? I have seen KNN with the Manhattan distance work well on data containing many 0/1 features (though not survey data) together with some other continuous fields.

Are there any guidelines on when each works better, and why?
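A minimal sketch (with made-up toy points, not from any real dataset) showing that the two metrics can disagree about which neighbour is nearest, which is exactly why the choice matters for KNN:

```python
# Toy illustration: Manhattan and Euclidean distance can rank
# neighbours differently, so KNN results can change with the metric.

def manhattan(a, b):
    # L1 distance: sum of absolute coordinate differences
    return sum(abs(x - y) for x, y in zip(a, b))

def euclidean(a, b):
    # L2 distance: square root of the sum of squared differences
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

query = (0.0, 0.0)
a = (2.0, 2.0)   # Manhattan 4.0, Euclidean ~2.83
b = (0.0, 3.5)   # Manhattan 3.5, Euclidean 3.5

# Manhattan says b is the nearer neighbour...
assert manhattan(query, b) < manhattan(query, a)
# ...while Euclidean says a is.
assert euclidean(query, a) < euclidean(query, b)
```

Intuitively, Euclidean distance penalises one large coordinate difference less than Manhattan does relative to several moderate ones, so points that differ strongly in a single feature look comparatively closer under the L2 metric.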

  • Does this answer your question? [Euclidean distance is usually not good for sparse data?](https://stats.stackexchange.com/questions/29627/euclidean-distance-is-usually-not-good-for-sparse-data) – Todd Burus Jan 05 '20 at 05:39
  • For binary data, _squared_ euclidean d = Manhattan d and is also called Hamming d. This d is therefore _metric_ (as any Manhattan d is). While for quantitative data, squared euclidean d isn't metric and is not equal to Manhattan. – ttnphns Jan 05 '20 at 06:23
  • Note also that the _square root_ of the Euclidean or of the Manhattan d, for scale as well as binary data, is not only metric, it is also geometrically Euclidean (embeds in Euclidean space), which can be convenient for techniques mapping into Euclidean space. – ttnphns Jan 05 '20 at 06:38
  • For any kind of data, cosine similarity and (squared) euclidean d are precisely related, so if you can compute one you are always able to convert it into the other. – ttnphns Jan 05 '20 at 06:39
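Two of the identities stated in the comments are easy to check numerically. The sketch below (toy vectors, plain Python) verifies that for binary data the squared Euclidean distance equals the Manhattan distance equals the Hamming distance, and that for unit-length vectors the squared Euclidean distance and cosine similarity are related by ||a - b||² = 2(1 - cos(a, b)):

```python
# Hedged sketch verifying two claims from the comments on toy vectors.

def sq_euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(x * x for x in b) ** 0.5
    return dot / (na * nb)

# 1) Binary data: squared Euclidean == Manhattan == Hamming distance,
#    since each coordinate contributes 0 or 1 under all three.
u, v = (1, 0, 1, 1, 0), (0, 0, 1, 0, 1)
hamming = sum(x != y for x, y in zip(u, v))
assert sq_euclidean(u, v) == manhattan(u, v) == hamming == 3

# 2) Cosine vs. squared Euclidean: with ||a|| = ||b|| = 1,
#    ||a - b||^2 = 2 * (1 - cos(a, b)), so either can be
#    converted into the other.
a = tuple(x / 5.0 for x in (3.0, 4.0))  # unit-normalise (3, 4)
b = (1.0, 0.0)
assert abs(sq_euclidean(a, b) - 2 * (1 - cosine(a, b))) < 1e-12
```

For vectors that are not unit-normalised, the general form ||a - b||² = ||a||² + ||b||² - 2·||a||·||b||·cos(a, b) gives the same conversion, which is what the last comment refers to.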

0 Answers