Note: read down to below "Question" to find the question.
Background:
In a previous question I asked how to group what I would call nodes on a network graph based on a connectivity matrix. (link)
The nomenclature used for a group of points was "cluster", "community" or "clique".
The general question behind the question is a fundamental in numerics which is "what is self" or "how is a chicken like a sphere". As cognitively dissonant as the hyperbole sounds - it allows image recognition, finite element and finite difference methods. It lives behind the idea of a limit.
When are two distinct things close enough to be considered essentially the same?
Here is the sketch that I used:
Let's say that I had two separate graphs, one with the node labels 1...10, and another with labels a...j but otherwise the same connectivity. Outside of labels they would be the same graph.
What if I added a $k^{th}$ node, or broke a link. They would otherwise be quite similar.
Question:
My question is:
- How do I measure a general "distance" between two graphs?
The answer isn't going to be meaningful if it doesn't in some way address the following:
- What if they do not have the same number of "nodes"?
- Is there something like "image registration" for graphs?
I am not looking for the perfect solution to an unsolved problem. Candidate sub-optimal semi-hacks that work in some cases are the "soup de jour" of this question.
One option is to say "they have to have the same number of nodes, you contrive it as a time-series via some characteristic spanning walk, and then use time-series metrics to compare one graph to another. I don't know if such a thing exists.
Image registration allows sub-pixel overlaying of otherwise dis-similar images of the same object. It uses cross-correlation functions - very fast using FFT methods, and interpolation around the "peak". An analog is to move the graph to image space, then perform image registration.
A general measure might not be about moving to a different domain and using domain-specific methods. It might be a topology solution.
Explanation, references, or sample code are always welcome. Same goes for comments and suggestions.