I am trying to determine wether my classifier has obtained a statistically significant result. The problem: Classify a heartbeat into 2 classes either "normal" or abnormal"
Let's say I have the following Confusion Matrix Generated for model A
I found this post Comparing two classifier accuracy results for statistical significance with t-test asking a similar question and am using the post by Ébe Isaac to try and solve this however I am running into an issue.
Work so far:
let α = 0.1
Let p1 be the probability of model A
Let p2 be the probability of model B (which always guesses "normal")
In this case p1 = (404 + 132)/(648) = 0.827
p2 = (404)/(648) = 0.623 because p2 always guesses "normal"
Is the result of p1 statistically significant
Ho: p = 0.623
Ha: p > 0.623
From this cross validated post Comparing two classifier accuracy results for statistical significance with t-test asking a similar question by Ébe Isaac
p Hat = (404 + 536)/648 = 1.45
Z = (0.827 - 0.623)/sqrt(2 * 1.45 * (1-1.45)/648)
The issue is that this gives me an error since the square root is negative!
Can someone please help explain why I am getting this error and show me the steps to complete the problem? Thanks!