I have many paths that all come from the same graph, and I am trying to cluster them. My first idea was to simply use the Levenshtein distance.
The problem is that two very short paths with no nodes in common end up with a smaller distance than a very short path and a very long path that contains the short one (e.g., A-B-C and X-Y-Z are closer to each other than A-B-C and A-B-C-D-E-F-G-H-I-J-K-L).
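To make the problem concrete, here is a minimal sketch that treats each path as a sequence of nodes and computes a plain Levenshtein distance with unit costs. The `levenshtein` helper and the variable names are only for illustration and are not taken from any particular library.

```python
def levenshtein(a, b):
    """Edit distance between two node sequences (insert/delete/substitute, cost 1 each)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,             # delete x
                curr[j - 1] + 1,         # insert y
                prev[j - 1] + (x != y),  # substitute (free if the nodes match)
            ))
        prev = curr
    return prev[-1]


# The three example paths, split into node lists.
short = "A-B-C".split("-")
other = "X-Y-Z".split("-")
long_path = "A-B-C-D-E-F-G-H-I-J-K-L".split("-")

d_disjoint = levenshtein(short, other)       # no nodes in common
d_contained = levenshtein(short, long_path)  # short path is a prefix of the long one
print(d_disjoint, d_contained)  # 3 9
```

The disjoint pair gets distance 3 while the pair where one path contains the other gets distance 9, which is the opposite of the ordering I want.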
I would like paths to end up in the same cluster when they have nodes (waypoints) in common. In addition, some nodes can be more important than others.
I am not very familiar with the standard distance metrics for this kind of problem. What would be a good distance metric? Can you recommend any good resources?