I am training a neural network to classify chest X-ray scans for my final MSc project. My dataset has 13,808 images: 3,616 labelled COVID and 10,192 labelled normal, so the COVID-to-normal split is 26.2%/73.8%. COVID is the positive class and normal is the negative class. I am using Keras to build a CNN, and I am a bit overwhelmed by all the different metrics.
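To make the imbalance concrete, here is a minimal sketch of the arithmetic behind those numbers. The label encoding (0 = normal, 1 = COVID) and the helper name `balanced_class_weights` are my own assumptions for illustration; the weighting formula is the common "balanced" heuristic, total / (number of classes × class count), and is only one of several possible choices.

```python
# Class counts from the dataset described above.
# Assumption: labels are encoded as 0 = normal (negative), 1 = COVID (positive).
counts = {0: 10192, 1: 3616}
n_total = sum(counts.values())

# Share of each class in the dataset.
prevalence = {label: n / n_total for label, n in counts.items()}

# Accuracy of a trivial model that always predicts the majority class.
# Any real model has to beat this for accuracy to mean anything.
majority_baseline_accuracy = max(counts.values()) / n_total


def balanced_class_weights(class_counts):
    """Hypothetical helper: weight each class inversely to its frequency."""
    total = sum(class_counts.values())
    n_classes = len(class_counts)
    return {label: total / (n_classes * n) for label, n in class_counts.items()}


class_weight = balanced_class_weights(counts)

print(f"COVID share: {prevalence[1]:.3f}")
print(f"Always-normal accuracy: {majority_baseline_accuracy:.3f}")
print(f"Class weights: {class_weight}")
```

A dictionary of this shape can be passed to Keras via the `class_weight` argument of `model.fit`, so that errors on the minority class contribute more to the loss. Whether that is needed at a roughly 1:3 ratio is part of what I am asking.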
I have read that accuracy is a poor measure of performance, especially for imbalanced datasets, and that for medical imaging it is common to use sensitivity and specificity, as well as metrics like F1-score, AUC-ROC, and AUC-PR.
My reasoning is that minimizing false negatives, and therefore maximizing sensitivity (recall), is the priority in this context, because telling someone they do not have COVID when they do could allow the virus to spread. False positives are also undesirable, since people would take unnecessary precautions, but they matter less than false negatives.
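As a check on my own understanding, here is a minimal sketch of how I believe these metrics fall out of the confusion matrix, and how lowering the decision threshold trades specificity for sensitivity. The labels and predicted probabilities are made-up toy values, not results from my model, and the function names are hypothetical.

```python
def confusion_counts(y_true, y_prob, threshold=0.5):
    """Count TP, FP, TN, FN, predicting positive when prob >= threshold."""
    tp = fp = tn = fn = 0
    for truth, prob in zip(y_true, y_prob):
        predicted = 1 if prob >= threshold else 0
        if predicted == 1 and truth == 1:
            tp += 1
        elif predicted == 1 and truth == 0:
            fp += 1
        elif predicted == 0 and truth == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def safe_div(numerator, denominator):
    """Return 0.0 instead of raising when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def summarize(tp, fp, tn, fn):
    """Derive the usual metrics from the four confusion-matrix counts."""
    sensitivity = safe_div(tp, tp + fn)   # recall / true positive rate
    specificity = safe_div(tn, tn + fp)   # true negative rate
    precision = safe_div(tp, tp + fp)     # positive predictive value
    f1 = safe_div(2 * precision * sensitivity, precision + sensitivity)
    accuracy = safe_div(tp + tn, tp + fp + tn + fn)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
        "accuracy": accuracy,
    }


# Toy data (invented): 1 = COVID, 0 = normal.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_prob = [0.9, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1, 0.1, 0.05, 0.3]

default_metrics = summarize(*confusion_counts(y_true, y_prob, threshold=0.5))
lenient_metrics = summarize(*confusion_counts(y_true, y_prob, threshold=0.25))

for name, metrics in [("0.50", default_metrics), ("0.25", lenient_metrics)]:
    print(name, {k: round(v, 3) for k, v in metrics.items()})
```

On this toy data, dropping the threshold from 0.5 to 0.25 catches the one missed COVID case but adds two false positives, which is the trade-off I described above. AUC-ROC and AUC-PR summarize this behaviour across all thresholds rather than at a single one.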
I am a conversion student in computer science, so I am relatively new to machine learning and statistics. I would greatly appreciate any advice on how much of a problem the class imbalance is and which metrics would be most appropriate in this context. Thank you.