Performing cluster analysis, I have a reference partition and the results of two methods, A and B. I can calculate agreement metrics (like adjusted mutual information) between the reference and either result, or I can test the reference and either result for independence using the G-test or the Chi-squared test. The G-test can be formulated in terms of mutual information.
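To make the setup concrete, here is a minimal sketch (with made-up toy label vectors) of both quantities for one method: adjusted mutual information between the reference and a result, and a G-test of independence on their contingency table.

```python
# Sketch: AMI and a G-test of independence between a reference partition
# and one clustering result. The label vectors are toy data.
from scipy.stats import chi2_contingency
from sklearn.metrics import adjusted_mutual_info_score
from sklearn.metrics.cluster import contingency_matrix

reference = [0, 0, 0, 1, 1, 2, 2]
result_a = [0, 0, 1, 1, 1, 2, 2]

ami = adjusted_mutual_info_score(reference, result_a)

# Contingency table: rows = reference clusters, columns = result clusters.
table = contingency_matrix(reference, result_a)

# lambda_="log-likelihood" turns chi2_contingency into a G-test;
# the statistic G equals 2 * N * MI(reference, result) in nats.
g, p, dof, expected = chi2_contingency(table, lambda_="log-likelihood")
print(ami, g, p)
```

Note that with samples this small the asymptotic chi-squared approximation behind the p-value is poor; the toy data only illustrates the mechanics.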
Please note that clustering results are not to be confused with classification results: in classification, 1,1,1,2,2,3,3 and 1,1,1,3,3,2,2 would be different results, but in clustering they are identical.
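This relabelling invariance can be checked directly: a classification metric like accuracy is sensitive to swapping the labels 2 and 3, while adjusted mutual information sees the same partition both times.

```python
# Toy illustration of the clustering-vs-classification distinction:
# relabelling clusters changes accuracy but not adjusted mutual information.
from sklearn.metrics import accuracy_score, adjusted_mutual_info_score

reference = [1, 1, 1, 2, 2, 3, 3]
relabeled = [1, 1, 1, 3, 3, 2, 2]  # same partition, labels 2 and 3 swapped

acc = accuracy_score(reference, relabeled)
ami_same = adjusted_mutual_info_score(reference, reference)
ami_swap = adjusted_mutual_info_score(reference, relabeled)
print(acc, ami_same, ami_swap)  # accuracy drops, both AMI values are 1.0
```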
I found that method A is better than method B, but I want to know whether it is statistically significantly better, using a significance test (G-test or Chi-squared test). But how can I test that? I would have two contingency tables: one comparing the reference with the results of method A, and one comparing the reference with the results of method B. My idea is to check whether these two tables are significantly different. If I had classes (as in classification), I could simply treat each cell of one table as the expected value and the corresponding cell of the other table as the observed value and perform the G-test. However, while the reference is the same for both tables (so, e.g., the row marginals and the row marginal entropy agree), the method results (e.g. the column marginals and the column marginal entropy) are not the same.
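The marginal problem described above can be seen directly on toy data: the two tables share their row marginals (the reference cluster sizes) but generally not their column marginals.

```python
# Sketch of the two contingency tables described above, on toy labels.
# Row marginals (reference cluster sizes) agree across the two tables;
# column marginals (cluster sizes of methods A and B) need not.
from sklearn.metrics.cluster import contingency_matrix

reference = [0, 0, 0, 1, 1, 2, 2]
result_a = [0, 0, 1, 1, 1, 2, 2]
result_b = [0, 1, 1, 1, 2, 2, 2]

table_a = contingency_matrix(reference, result_a)
table_b = contingency_matrix(reference, result_b)

print(table_a.sum(axis=1), table_b.sum(axis=1))  # identical row marginals
print(table_a.sum(axis=0), table_b.sum(axis=0))  # differing column marginals
```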
It is not really clear which "observed" values should be compared with which "expected" values.
Maybe I need interaction information for all that, but I cannot figure out how.
Maybe information gain is an approach.
Another idea is to use the G-test to calculate the p-value for the hypothesis that the result of method A/B is independent of the reference. Let's call these values $p_A$ and $p_B$. Since method A is "more non-independent" than method B, $p_B > p_A$. One could then calculate $p_B - p_A$ and try to interpret that, but I am not sure how, or whether that difference is actually interpretable.
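Mechanically, computing the two p-values is straightforward; the sketch below does so on toy labels where method A is close to the reference and method B is nearly independent of it, so $p_B > p_A$. What the printed difference means statistically is exactly the open question.

```python
# Sketch of the p-value idea above: G-test p-values for "result is
# independent of the reference", one per method, on toy labels.
from scipy.stats import chi2_contingency
from sklearn.metrics.cluster import contingency_matrix

reference = [0, 0, 0, 1, 1, 2, 2]
result_a = [0, 0, 1, 1, 1, 2, 2]  # close to the reference
result_b = [0, 1, 2, 0, 1, 2, 0]  # nearly independent of the reference


def g_test_p(ref, res):
    """p-value of the G-test of independence for two label vectors."""
    table = contingency_matrix(ref, res)
    _, p, _, _ = chi2_contingency(table, lambda_="log-likelihood")
    return p


p_a = g_test_p(reference, result_a)
p_b = g_test_p(reference, result_b)
print(p_a, p_b, p_b - p_a)  # the difference whose meaning is unclear
```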
As one can see, I am somewhat in the dark about how to approach this.
I think, but am not sure, that what I am asking is different from https://online.stat.psu.edu/stat504/lesson/5 (and thus from https://stats.stackexchange.com/a/147980/83252). I think this because there they compare the whole three-way contingency table against the expected-value table under independence. However, I want to compare only "two sides" of the three-way table with each other (so to speak).
If I get it right, $\chi^2$ for multidimensional data is about taking a three-way table and checking for mutual (complete) independence of the variables X, Y, Z. At least, this is what the example in stats.stackexchange.com/a/147980/83252 suggests. However, can my question be rephrased as the question of whether $P(X,Y)$ and $P(X,Z)$ are independent? Maybe that could be a path to an answer to my original question?