I am applying two diagnostic tests with dichotomous outcomes (condition present yes/no) to a sample of patient data. Both take a range of vital parameters into account and produce a test result, which is then compared to the gold standard so that the usual statistics (sensitivity, specificity, etc.) can be calculated. One test seems to perform somewhat better than the other in terms of the area under the curve.
Is there any way to test whether this modest improvement in diagnostic quality is statistically significant? I have come across McNemar's test for paired nominal data, which is applicable to a 2x2 contingency table, but I am wondering whether it is also appropriate and meaningful in this case. Most explanations and examples seem to pertain to pre- and post-treatment comparisons, which is not the situation here: both tests are applied to the same patients at the same time.
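To make the question concrete, here is a minimal sketch of how I understand McNemar's test would be applied to this setup. It is only my assumption of the procedure, not something I have validated: the pairing is "test A correct vs. test B correct" on the same patient, computed separately among gold-standard positives (comparing sensitivities) and gold-standard negatives (comparing specificities). The function names and the toy data are made up for illustration, and I use the exact binomial form of the test on the discordant pairs.

```python
from math import comb


def discordant_counts(test_a, test_b, truth, among=1):
    """Count discordant pairs among patients whose gold standard equals `among`.

    Returns (b, c): b = only test A correct, c = only test B correct.
    among=1 compares sensitivities, among=0 compares specificities.
    """
    b = c = 0
    for res_a, res_b, gold in zip(test_a, test_b, truth):
        if gold != among:
            continue
        a_ok = res_a == gold
        b_ok = res_b == gold
        if a_ok and not b_ok:
            b += 1
        elif b_ok and not a_ok:
            c += 1
    return b, c


def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value from the two discordant counts."""
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, so no evidence of a difference
    k = min(b, c)
    # Under H0 each discordant pair favours either test with probability 0.5,
    # so the smaller count follows a Binomial(n, 0.5) lower tail.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Made-up toy data: 1 = condition present, 0 = absent.
truth = [1, 1, 1, 1, 0, 0, 0]
test_a = [1, 1, 0, 1, 0, 1, 0]
test_b = [1, 0, 0, 0, 0, 0, 0]

b_pos, c_pos = discordant_counts(test_a, test_b, truth, among=1)
print("sensitivity comparison p-value:", mcnemar_exact(b_pos, c_pos))
```

If this is the right way to set it up, only the patients on whom the two tests disagree contribute to the p-value, which is part of why I am unsure whether it answers the question about overall diagnostic quality.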