I am working on a research project in which I need to compare several distance-based methods, say TF-IDF-based KNN for classification and K-means for clustering. Suppose I use cosine similarity for one and cosine distance for the other: how will that affect the evaluations?
If by cosine distance you mean `1-cos` or `sqrt(1-cos)`, then it is directly tied to cosine similarity, [being the Euclidean distance for the unit-normalized vectors](http://stats.stackexchange.com/a/36158/3277) (up to a constant factor: for unit vectors that distance is `sqrt(2(1-cos))`). Informationally, therefore, the two are equivalent. But they differ in form: an angular measure vs. a linear distance. Choose whichever you think is more appropriate for your classification algorithm or interpretation. – ttnphns Jul 01 '16 at 11:31
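To make the link between the two measures concrete, here is a minimal sketch in plain Python. The vectors `a` and `b` and the helper names (`norm`, `unit`, `cosine_similarity`, `euclidean`) are made up for illustration and are not from the question. It checks that, once the vectors are scaled to unit length, the squared Euclidean distance equals `2 * (1 - cos)`.

```python
import math

def norm(v):
    """L2 norm (length) of a vector given as a list of floats."""
    return math.sqrt(sum(x * x for x in v))

def unit(v):
    """Scale a vector to unit length."""
    n = norm(v)
    return [x / n for x in v]

def cosine_similarity(u, v):
    """Cosine of the angle between u and v."""
    return sum(x * y for x, y in zip(u, v)) / (norm(u) * norm(v))

def euclidean(u, v):
    """Ordinary Euclidean distance between u and v."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))

# Hypothetical example vectors (e.g. two small TF-IDF rows).
a = [3.0, 1.0, 0.0, 2.0]
b = [1.0, 4.0, 2.0, 0.0]

cos = cosine_similarity(a, b)
d_unit = euclidean(unit(a), unit(b))

# For unit vectors: ||a - b||^2 = 2 - 2*cos = 2 * (1 - cos)
print(math.isclose(d_unit ** 2, 2 * (1 - cos)))
```

Because `2 * (1 - cos)` decreases as `cos` increases, ranking neighbours by highest cosine similarity or by lowest distance on unit-normalized vectors gives the same order, which is the sense in which the two are informationally equivalent.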
1 Answer
If by cosine distance you meant Euclidean distance, note that it is susceptible to clustering entities by their L2 norm (their magnitude) instead of by their direction. That is, vectors with quite different directions can end up clustered together because their distances from the origin are similar.
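A small sketch of that effect, using three made-up 2-D vectors (`a`, `b`, `c` and the helper functions are illustrative assumptions, not data from the question): `a` and `c` point the same way but differ in length, while `a` and `b` are orthogonal but equally long.

```python
import math

def norm(v):
    """L2 norm (length) of a vector."""
    return math.sqrt(sum(x * x for x in v))

def euclidean(u, v):
    """Euclidean distance on the raw (un-normalized) vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))

def cosine_distance(u, v):
    """1 - cosine similarity."""
    dot = sum(x * y for x, y in zip(u, v))
    return 1.0 - dot / (norm(u) * norm(v))

a = [1.0, 0.0]    # short vector along the x-axis
b = [0.0, 1.0]    # same length as a, orthogonal direction
c = [10.0, 0.0]   # same direction as a, ten times longer

# Raw Euclidean distance puts a closer to b (similar magnitude) ...
print(euclidean(a, b) < euclidean(a, c))
# ... while cosine distance puts a closer to c (same direction).
print(cosine_distance(a, c) < cosine_distance(a, b))
```

Normalizing every vector to unit length before running a Euclidean-based method such as K-means removes this magnitude effect, since only direction is left to compare.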
As for the effects, you can try it out yourself by referring to the answers to this question.