I know the main concepts of data/text mining, but I have used them mainly in binary classification problems (just two classes). I am now dealing with a problem with 8 classes and am struggling to calculate evaluation metrics such as precision and recall.

Can I convert a multi-class confusion matrix to a binary confusion matrix with TP, FP, TN and FN values and then calculate the aforementioned metrics? If not, is there an alternative evaluation metric for a classifier when I have a confusion matrix like this:

[image: 8×8 confusion matrix for classes T1–T8, reproduced as cmg in the answer below]

andrealmeida

1 Answer


Welcome to the website; this is a variation of a commonly asked question. You can definitely convert a multi-class confusion matrix to binary (one-vs-rest) confusion matrices: for each class, treat it as the positive class and everything else as negative, so that TP is the diagonal cell for that class, FN is the rest of its row, FP is the rest of its column, and TN is everything that remains.

Below is some R code showing how you can collapse a confusion matrix to a binary one. It also calculates Cohen's kappa to get the overall 'rater' agreement between the classifier and the actual class (of cmg).

cmg <- matrix(c(1639, 116, 49, 35, 138, 0, 0, 236,
                 150, 274, 27, 21,  28, 0, 0,  73,
                  22,  24, 58,  9,  94, 0, 0,  30,
                  33,  27, 31, 21, 146, 0, 0,  49,
                  14,   9,  5,  1,  49, 0, 0,  22,
                   1,   0,  1,  1,   7, 0, 0,   6,
                  11,   0,  0,  1,  14, 0, 0,  21,
                 201,  11,  8,  5,  49, 0, 0, 253), 
              ncol = 8, byrow = TRUE,   # fill row-wise so the matrix matches the layout above
              dimnames = rep(list(c("T1","T2","T3","T4","T5","T6","T7","T8")), 2))

library(psych)  # provides cohen.kappa()

# Overall agreement
overall_agg <- sum(diag(cmg))/sum(cmg)

# Overall Cohen's Kappa for cmg
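# (kappa corrects the observed agreement for chance agreement:
#  kappa = (p_o - p_e) / (1 - p_e), where p_o is overall_agg above and
#  p_e is the agreement expected from the row/column marginals)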
unweighted_kappa <- cohen.kappa( cmg, n.obs=sum(cmg) )

# initialise containers
spec_agr_guideline <- list()        
collapsed_mat_guideline <- list() 
unweighted_kappa_psych <- list()

# loop through all treatments    
for (i in seq_len(nrow(cmg))) {
  # Specific (positive) agreement per class
  spec_agr_guideline[[i]] <- 2 * cmg[i, i] / (sum(cmg[i, ]) + sum(cmg[, i]))
  # Collapsed (one-vs-rest) binary confusion matrix per class
  collapsed_mat_guideline[[i]] <- matrix(c(cmg[i, i],                 sum(cmg[i, ]) - cmg[i, i],
                                           sum(cmg[, i]) - cmg[i, i], sum(cmg) - sum(cmg[i, ]) - sum(cmg[, i]) + cmg[i, i]),
                                         ncol = 2)
  # Unweighted Cohen's kappa per collapsed (binary) confusion matrix
  unweighted_kappa_psych[[i]] <- cohen.kappa(collapsed_mat_guideline[[i]], n.obs = sum(collapsed_mat_guideline[[i]]))
}
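
For the precision and recall the question asks about, you can read them straight off cmg one class at a time, without the collapsed matrices. A minimal sketch, assuming (as in the byrow layout above) that rows of cmg are the actual classes and columns the predictions; the variable names are just illustrative:

# Per-class precision, recall and F1, one-vs-rest
precision <- diag(cmg) / colSums(cmg)   # TP / (TP + FP); NaN for T6/T7, which are never predicted
recall    <- diag(cmg) / rowSums(cmg)   # TP / (TP + FN)
f1        <- 2 * precision * recall / (precision + recall)

# Macro-averages: simple means over the classes (NaN classes dropped)
macro_precision <- mean(precision, na.rm = TRUE)
macro_recall    <- mean(recall,    na.rm = TRUE)
macro_f1        <- mean(f1,        na.rm = TRUE)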

Furthermore, you can do some other cool stuff to assess the performance of a multi-class classifier. Some relevant answers from Cross Validated are: link1, link2, link3.

Zhubarb
  • I was unaware of this Cohen's kappa metric. I'll read about it, thanks. – andrealmeida Aug 14 '14 at 18:16
  • So, can I define the overall precision of the classifier as the average precision of each class? And what I say here for precision, I mean for all the other metrics as well. – andrealmeida Aug 14 '14 at 19:17
  • Do you necessarily need overall precision / accuracy? You could just report them by class. That contains more information. – Zhubarb Aug 15 '14 at 07:11