Fast distance metric between a new data entry and available clusters of data

Asked Mar 04 '22 at 10:39

Active Mar 04 '22 at 10:39


Assume we divide a large data set D into m partitions in a distributed learning setting. Each partition is trained separately, so the resulting models act as local experts. Now we have some new test points that we want to assign to the partitions in order to obtain predictions. Which distance metric is fastest to compute as the number of new points grows? Also, apart from distance metrics, are there any similarity-based measures that can be used to match new entries to the available partitions?
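To make the setup concrete, here is a minimal sketch of one possible assignment rule, assuming each partition is summarised by a single centroid (the question does not specify this). The names `assign_to_nearest_centroid`, `assign_by_cosine`, `centroids` and `X_new` are hypothetical. The first function uses squared Euclidean distance computed with one matrix product, and the second uses cosine similarity as an example of a similarity-based measure.

```python
import numpy as np

def assign_to_nearest_centroid(X, centroids):
    """Return, for each row of X, the index of the closest centroid
    (squared Euclidean distance)."""
    # ||x - c||^2 = ||x||^2 - 2 x.c + ||c||^2. The ||x||^2 term is the same
    # for every centroid, so dropping it does not change the argmin.
    cross = X @ centroids.T                    # shape (n_points, m)
    c_norms = np.sum(centroids ** 2, axis=1)   # shape (m,)
    return np.argmin(c_norms - 2.0 * cross, axis=1)

def assign_by_cosine(X, centroids):
    """Return, for each row of X, the index of the most similar centroid
    (cosine similarity). Assumes no row or centroid is the zero vector."""
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
    Cn = centroids / np.linalg.norm(centroids, axis=1, keepdims=True)
    return np.argmax(Xn @ Cn.T, axis=1)

# Toy data: m = 3 partitions in 2 dimensions (assumed values for illustration).
centroids = np.array([[1.0, 1.0], [10.0, -10.0], [-10.0, 5.0]])
X_new = np.array([[2.0, 1.0], [9.0, -12.0], [-8.0, 4.0]])

labels = assign_to_nearest_centroid(X_new, centroids)
cosine_labels = assign_by_cosine(X_new, centroids)
```

Under this assumption both rules cost O(n · m · d) for n new points in d dimensions, and the work is dominated by a single matrix product, so the choice between them is more about what "close" should mean than about speed.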

Ham82

It is a bit strange to start by asking which measure is faster to compute without first deciding what can conceptually serve as a proximity between a cluster and a point. Will that be the distance to the centroid? The medoid? The nearest neighbour? Etc. – ttnphns Mar 05 '22 at 08:56
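The three notions of point-to-cluster proximity named in this comment can be written down as follows. This is only an illustrative sketch with hypothetical function names and toy data, using Euclidean distance throughout. It shows that the same point can be at quite different "distances" from the same cluster depending on the definition.

```python
import numpy as np

def centroid_distance(x, cluster):
    """Distance from x to the mean of the cluster members."""
    return float(np.linalg.norm(x - cluster.mean(axis=0)))

def medoid_distance(x, cluster):
    """Distance from x to the medoid: the member with the smallest
    total distance to all other members."""
    pairwise = np.linalg.norm(cluster[:, None, :] - cluster[None, :, :], axis=2)
    medoid = cluster[np.argmin(pairwise.sum(axis=1))]
    return float(np.linalg.norm(x - medoid))

def nearest_neighbour_distance(x, cluster):
    """Distance from x to the closest single member of the cluster."""
    return float(np.min(np.linalg.norm(cluster - x, axis=1)))

# Toy cluster and query point (assumed values for illustration).
cluster = np.array([[0.0, 0.0], [4.0, 0.0], [0.0, 4.0]])
x = np.array([5.0, 0.0])

d_centroid = centroid_distance(x, cluster)
d_medoid = medoid_distance(x, cluster)
d_nn = nearest_neighbour_distance(x, cluster)
```

With a precomputed centroid or medoid, scoring a new point against a cluster costs O(d). The nearest-neighbour version costs O(n_k · d) for a cluster of n_k members unless a spatial index is used, so the conceptual choice also determines the cost.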

0 Answers