We have a two-group classification problem that is giving some bizarre training results. I have tested this in both R and Python across multiple algorithms and gotten similar results, but I am including only the R SVM output below. In brief, we are trying to predict group membership from brain imaging data. Both region-of-interest features (17 features) and ICA loading coefficients (60 features) give the same result, with and without feature selection.
The crux of the problem is illustrated in the R output below:

    Support Vector Machines with Radial Basis Function Kernel

    544 samples
     17 predictor
      2 classes: 'ClassA', 'ClassB'

    No pre-processing
    Resampling: Cross-Validated (10 fold, repeated 5 times)
    Summary of sample sizes: 490, 490, 490, 489, 490, 489, ...
    Resampling results across tuning parameters:

      sigma  C    ROC        Sens       Spec
      0.005  0.5  0.5297579  1.0000000  0.000000000
      0.005  1.0  0.5422166  1.0000000  0.000000000
      0.005  1.5  0.5505414  1.0000000  0.000000000
      0.005  2.0  0.5511248  1.0000000  0.000000000
      0.005  2.5  0.5513616  1.0000000  0.000000000
      0.005  3.0  0.5503681  1.0000000  0.000000000
      0.005  3.5  0.5644331  1.0000000  0.000000000
      0.011  0.5  0.5469811  1.0000000  0.000000000
      0.011  1.0  0.5503466  1.0000000  0.000000000
      0.011  1.5  0.5683659  1.0000000  0.000000000
      0.011  2.0  0.5690950  1.0000000  0.000000000
      0.011  2.5  0.5722130  1.0000000  0.000000000
      0.011  3.0  0.5576593  1.0000000  0.000000000
      0.011  3.5  0.5688788  1.0000000  0.000000000
      0.200  0.5  0.5544983  0.9990244  0.000000000
      0.200  1.0  0.5492066  0.9995122  0.000000000
      0.200  1.5  0.5492004  1.0000000  0.000000000
      0.200  2.0  0.5543411  1.0000000  0.000000000
      0.200  2.5  0.5461315  0.9995122  0.000000000
      0.200  3.0  0.5416475  1.0000000  0.000000000
      0.200  3.5  0.5535835  1.0000000  0.001538462

    ROC was used to select the optimal model using the largest value.
    The final values used for the model were sigma = 0.011 and C = 2.5.

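As can be seen, sensitivity is essentially 1 and specificity is essentially 0 for every tuning combination, while ROC stays just above chance (roughly 0.53 to 0.57). In other words, assuming ClassA is being treated as the positive class, the trained model assigns nearly every sample to ClassA. The pattern is the same for every algorithm I tried in both R and Python. I feel like this data is trying to tell me something, but I don't know what it is, so I am looking for insight if anyone has any to offer.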
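To make the Sens/Spec pattern concrete, here is a minimal Python sketch of what those two columns measure. The labels are made up for illustration (they are not our imaging data), the helper name `sensitivity_specificity` is hypothetical, and I am assuming ClassA is the positive class. A classifier that ignores the features and always answers ClassA reproduces the same sensitivity of 1 and specificity of 0:

```python
def sensitivity_specificity(y_true, y_pred, positive="ClassA"):
    """Return (sensitivity, specificity), treating `positive` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)  # true positives
    fn = sum(t == positive and p != positive for t, p in pairs)  # false negatives
    tn = sum(t != positive and p != positive for t, p in pairs)  # true negatives
    fp = sum(t != positive and p == positive for t, p in pairs)  # false positives
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical labels for illustration only; not the real data.
y_true = ["ClassA"] * 6 + ["ClassB"] * 4

# A degenerate classifier that always predicts ClassA, whatever the input.
y_pred = ["ClassA"] * len(y_true)

sens, spec = sensitivity_specificity(y_true, y_pred)
print(sens, spec)  # 1.0 0.0
```

So the table above looks like what you would get from a model that has collapsed to predicting a single class, which is the behaviour I am trying to understand.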