Consider a case where the number of examples labelled 0 is 1400 and the number labelled 1 is 100. This dataset is imbalanced, with the majority of examples belonging to the normal class (0) and the minority to class 1.
The data labelled 0 denote normal operating conditions and the data labelled 1 denote abnormal conditions. I have taken 1 as the positive class and 0 as the negative class.
Assume the following confusion matrix is obtained for the binary classification:
cmMatrix =
             predicted 0    predicted 1
truth 0      1100 (TN)      300 (FP)
truth 1        30 (FN)       70 (TP)
cmMatrix = [1100,300;30,70];                       % rows = truth, columns = predicted
acc_0 = 100*(cmMatrix(1,1))/sum(cmMatrix(1,:));    % % of class-0 examples classified correctly
acc_1 = 100*(cmMatrix(2,2))/sum(cmMatrix(2,:));    % % of class-1 examples classified correctly
will give acc_0 = 78.5714
and acc_1 = 70
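As a cross-check of the MATLAB snippet above, the same per-class accuracies can be reproduced with a minimal NumPy sketch (variable names are mine):

```python
import numpy as np

# Confusion matrix from the post: rows = truth (0, 1), columns = predicted (0, 1)
cm = np.array([[1100, 300],
               [30,   70]])

# Per-class accuracy: correct predictions for a class divided by its row total
acc_0 = 100 * cm[0, 0] / cm[0].sum()  # 1100 / 1400
acc_1 = 100 * cm[1, 1] / cm[1].sum()  # 70 / 100

print(round(acc_0, 4), acc_1)  # 78.5714 70.0
```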
The confusion matrix is read as follows: out of 1400 normal events, 1100 are correctly identified as normal and 300 are incorrectly identified as abnormal. Out of 100 abnormal events, 70 are correctly detected as abnormal, whereas 30 are incorrectly detected as normal. I want to calculate the sensitivity and specificity for class 1, since that is of primary interest in abnormal event detection. This is how I did it:
Sensitivity = TP/(TP+FN) = 70/(70+30) = 0.70
Specificity = TN/(TN+FP) = 1100/(1100+300) = 0.7857
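The two formulas above can also be checked directly from the matrix entries; a short Python sketch (cell names follow the matrix layout in the post):

```python
import numpy as np

cm = np.array([[1100, 300],   # truth 0: TN, FP
               [30,   70]])   # truth 1: FN, TP

TN, FP = cm[0]
FN, TP = cm[1]

sensitivity = TP / (TP + FN)  # recall of the positive class (1)
specificity = TN / (TN + FP)  # recall of the negative class (0)

print(sensitivity, round(specificity, 4))  # 0.7 0.7857
```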
Q1) In this example, the sensitivity for class 1 equals the accuracy for class 1. Is it always the case that the sensitivity of each class equals its individual class accuracy?
Q2) Precision for class 1: TP/(predicted 1) = TP/(TP+FP) = 70/(70+300) = 0.1892,
which is much lower than class 1's accuracy. Is precision not connected to individual class accuracy?
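The gap comes from precision being computed down the "predicted 1" column rather than along the "truth 1" row, so the 300 false positives contributed by the large normal class dominate the denominator. A minimal sketch of that calculation:

```python
cm = [[1100, 300],   # truth 0: TN, FP
      [30,   70]]    # truth 1: FN, TP

TP = cm[1][1]
FP = cm[0][1]

# Precision divides by the column total (everything predicted as class 1),
# which mixes in false positives from the majority class.
precision_1 = TP / (TP + FP)  # 70 / 370

print(round(precision_1, 4))  # 0.1892
```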
Q3) For an imbalanced dataset, how can we tell whether the classifier has done a good job?