I have a classifier Y that selects between three categories: A, B and C.
I need to be able to show quantitatively that my model is better (and by how much) than a random classifier R that randomly picks among categories A, B, and C.
I intend to proceed as follows (a rough sketch of these steps appears after the list):
- Generate classifications using classifier R
- Generate a confusion matrix for the output of classifier R
- Generate classifications using classifier Y
- Generate a confusion matrix for the output of classifier Y
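
Here is a minimal sketch of what I have in mind. The data and variable names (`y_true`, `y_pred_Y`, etc.) are placeholders standing in for my real labels and model output, not my actual code:

```python
import numpy as np
from sklearn.metrics import confusion_matrix

rng = np.random.default_rng(0)
labels = ["A", "B", "C"]

# Placeholder data standing in for the real labels and the output of Y.
y_true = rng.choice(labels, size=1000)
y_pred_Y = y_true.copy()                      # pretend Y is often right ...
flip = rng.random(len(y_true)) < 0.3          # ... but wrong ~30% of the time
y_pred_Y[flip] = rng.choice(labels, size=flip.sum())

# Classifier R: picks uniformly at random among A, B, C.
y_pred_R = rng.choice(labels, size=len(y_true))

cm_R = confusion_matrix(y_true, y_pred_R, labels=labels)
cm_Y = confusion_matrix(y_true, y_pred_Y, labels=labels)
print("Confusion matrix for R:\n", cm_R)
print("Confusion matrix for Y:\n", cm_Y)
```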
However, having generated the two confusion matrices above, I'm not sure how to use them to solve my problem.
The "intuition" behind using the confusion matrices is that I can "visually" check and "compare" the sensitivity, specificity etc between the models etc.
I would like to be able to use the confusion matrices (if possible) to perform some test of the null hypothesis that the Bookmaker Informedness of Y is no better than that of R.
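
For reference, this is roughly how I am computing Bookmaker Informedness from a single confusion matrix. For the multiclass case I am assuming a one-vs-rest generalisation (per-class recall minus false-positive rate, averaged with prevalence weights); that weighting is my own assumption, so please correct me if it is not the standard definition:

```python
import numpy as np

def bookmaker_informedness(cm: np.ndarray) -> float:
    """Prevalence-weighted one-vs-rest informedness (my assumed multiclass form).

    cm is a square confusion matrix with rows = true class, columns = predicted class.
    """
    cm = np.asarray(cm, dtype=float)
    n = cm.sum()
    per_class = []
    for k in range(cm.shape[0]):
        tp = cm[k, k]
        fn = cm[k, :].sum() - tp
        fp = cm[:, k].sum() - tp
        tn = n - tp - fn - fp
        recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
        fpr = fp / (fp + tn) if (fp + tn) > 0 else 0.0
        per_class.append(recall - fpr)      # binary informedness = TPR - FPR
    prevalence = cm.sum(axis=1) / n
    return float(np.dot(prevalence, per_class))

# e.g. compare bookmaker_informedness(cm_Y) with bookmaker_informedness(cm_R)
```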
Can anyone help with how I can test this hypothesis, given data from the two confusion matrices?