Determine if a binary classifier has achieved a statistically significant accuracy

Question

I am trying to determine whether my classifier has obtained a statistically significant result. The problem: classify a heartbeat into one of two classes, "normal" or "abnormal".

Let's say I have the following confusion matrix generated for model A.

I found the post "Comparing two classifier accuracy results for statistical significance with t-test", which asks a similar question, and I am using the answer by Ébe Isaac to try to solve this. However, I am running into an issue.

Work so far:

Let α = 0.1

Let p1 be the accuracy of model A

Let p2 be the accuracy of model B (which always guesses "normal")

In this case p1 = (404 + 132)/648 = 0.827

p2 = 404/648 = 0.623 because model B always guesses "normal"

Is the result for p1 statistically significant?

Ho: p = 0.623

Ha: p > 0.623

Following the answer by Ébe Isaac in the Cross Validated post "Comparing two classifier accuracy results for statistical significance with t-test":

p Hat = (404 + 536)/648 = 1.45

Z = (0.827 - 0.623)/sqrt(2 * 1.45 * (1-1.45)/648)

The issue is that this gives me an error since the square root is negative!

Can someone please help explain why I am getting this error and show me the steps to complete the problem? Thanks!

score 1 · Answer 1 · answered Feb 19 '18 at 10:07


Let's say that you found that your result is "significant" but it is unacceptable from a business or clinical point of view (it is too small to be safe, it leads to losses that are too big, etc.). What then? Establishing that a result is not "bad" does not make it "good". The hypothesis test would tell you that the result differs from random predictions, but "better than random" usually does not equal "good". What if you used two classifiers and both were significant? Statistical significance is not meant to be a measure of something being "good" or "bad".


Tim


I totally agree! I just want to corroborate the benchmarks I already have with this model by making sure that the accuracy is statistically significant. Do you know where I made a mistake above and how I can fix it? – Sreehari R Feb 19 '18 at 10:18
@SreehariR see https://stats.stackexchange.com/questions/192291/variable-selection-using-cross-validated-pls-model-when-permutation-test-shows-l; you can also use the bootstrap here. – Tim Feb 19 '18 at 11:29
I am still not sure where I went wrong – Sreehari R Feb 19 '18 at 18:38
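
On the arithmetic itself, here is a minimal sketch that assumes both models were scored on the same 648 heartbeats and uses only the counts quoted in the question. In a pooled two-proportion z-test the pooled proportion is the combined number of correct predictions divided by the combined sample size n1 + n2 (here 648 + 648 = 1296), not by 648. That keeps it between 0 and 1, so the term under the square root stays positive. The function name `two_proportion_z` is illustrative and does not come from the linked post. The test also treats the two samples as independent, which is only an approximation when both models are evaluated on the same test set.

```python
import math

def two_proportion_z(x1, n1, x2, n2):
    """One-sided pooled two-proportion z-test (alternative: p1 > p2).

    x1, x2 are counts of correct predictions; n1, n2 are sample sizes.
    Assumes the two samples are independent (an approximation here).
    """
    p1, p2 = x1 / n1, x2 / n2
    # Pool over BOTH samples: divide by n1 + n2, not by n1 alone.
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Upper-tail probability of a standard normal, P(Z > z).
    p_value = 0.5 * math.erfc(z / math.sqrt(2))
    return pooled, z, p_value

# Counts taken from the question: model A gets 404 + 132 correct,
# model B (always "normal") gets 404 correct, both out of 648.
pooled, z, p_value = two_proportion_z(404 + 132, 648, 404, 648)
print(f"pooled = {pooled:.3f}, z = {z:.2f}, p-value = {p_value:.3g}")
```

With these counts the pooled proportion is 940/1296, roughly 0.725, and z comes out at roughly 8.2, so the p-value is far below α = 0.1. Because 1/n1 + 1/n2 equals 2/648 when both sample sizes are 648, the standard error has the same form as the expression in the question; only the pooled proportion changes.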
