I have corpora of classified text. From these I create vectors, one per document. The components of each vector are the weights of the words in that document, computed as TF-IDF values. Next I build a model in which every class is represented by a single vector, so the model has as many vectors as there are classes in the corpora. Each component of a model vector is the mean of the corresponding component across all document vectors in that class. For an unclassified vector, I determine its similarity to a model vector by computing the cosine between the two vectors.
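To make the setup concrete, here is a minimal sketch of what I mean. It is not my actual code: the function names (`class_centroids`, `cosine`, `classify`), the class labels, and the TF-IDF weights over a three-word vocabulary are all made up for illustration, and I assume the TF-IDF vectors have already been computed.

```python
import math
from collections import defaultdict


def class_centroids(vectors, labels):
    """Build one model vector per class: the component-wise mean
    of all document vectors that carry that class label."""
    grouped = defaultdict(list)
    for vec, label in zip(vectors, labels):
        grouped[label].append(vec)
    return {
        label: [sum(column) / len(vecs) for column in zip(*vecs)]
        for label, vecs in grouped.items()
    }


def cosine(a, b):
    """Cosine of the angle between two equal-length vectors.
    Returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify(vec, centroids):
    """Pick the class whose model vector has the highest cosine with vec."""
    return max(centroids, key=lambda label: cosine(vec, centroids[label]))


# Hypothetical TF-IDF weights (made-up numbers, 3-word vocabulary).
docs = [
    [0.9, 0.1, 0.0],
    [0.7, 0.3, 0.0],
    [0.0, 0.2, 0.8],
    [0.1, 0.0, 0.9],
]
labels = ["sports", "sports", "politics", "politics"]

centroids = class_centroids(docs, labels)
predicted = classify([0.5, 0.1, 0.1], centroids)
```

Here `centroids` holds one mean vector per class, and `predicted` is the label of the model vector with the largest cosine to the unclassified vector.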
Question: Can I use the Euclidean distance between an unclassified vector and a model vector to measure their similarity? If not, why not?
Thanks!