
Evaluating a classifier I implemented for a university project, I am observing an AUROC (area under the ROC curve) of 1.0, which implies there is a decision threshold with a true positive rate of 1.0 and a false positive rate of 0.0.

The dataset used for training was captured independently of the dataset used for evaluation.

Nevertheless, I am hesitant to report this AUROC value.

How should I interpret an AUROC value of 1.0 with respect to the general performance of the classifier? Is it overfitting, even though a different dataset (which is the real-world scenario) is used for testing? Does regularization make sense here?
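For reference, my evaluation roughly follows the standard WEKA pattern sketched below (the file names and the J48 classifier are placeholders, not my actual setup):

```java
import weka.classifiers.Classifier;
import weka.classifiers.Evaluation;
import weka.classifiers.trees.J48;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class AurocCheck {
    public static void main(String[] args) throws Exception {
        // Two independently captured datasets (file names are placeholders).
        Instances train = DataSource.read("train.arff");
        Instances test = DataSource.read("test.arff");
        train.setClassIndex(train.numAttributes() - 1);
        test.setClassIndex(test.numAttributes() - 1);

        // Any WEKA classifier; J48 is only an example.
        Classifier clf = new J48();
        clf.buildClassifier(train);

        // Evaluate on the held-out set only and read off the AUROC.
        Evaluation eval = new Evaluation(train);
        eval.evaluateModel(clf, test);
        System.out.println("Weighted AUROC: " + eval.weightedAreaUnderROC());
    }
}
```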

Jan

2 Answers


If the algorithm never saw the test data, then no, that is not overfitting. Overfitting means an algorithm fits the training data (and validation data, if one was used) extremely well while generalizing poorly to unseen data.

Now, you might ask whether the test sample is a good representation of the population. If it is, then $\text{AUC} = 1$ is the generalization performance.


Another possibility is an error in the computation (I had something similar happen to me, not kidding), so check your calculations again.
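For example, one way to double-check the number is to recompute the AUC directly from the predicted scores using the pairwise-ranking (Mann-Whitney) definition and compare it to what the library reports. A rough sketch for a binary problem, assuming the WEKA classes mentioned elsewhere in this thread (names and structure are illustrative):

```java
import java.util.ArrayList;
import java.util.List;

import weka.classifiers.Classifier;
import weka.core.Instance;
import weka.core.Instances;

public class ManualAuc {

    // AUC via the pairwise-ranking (Mann-Whitney) definition: the fraction of
    // (positive, negative) pairs in which the positive instance receives the
    // higher predicted score, counting ties as half.
    public static double auc(Classifier clf, Instances test, int positiveClass) throws Exception {
        List<Double> pos = new ArrayList<>();
        List<Double> neg = new ArrayList<>();
        for (int i = 0; i < test.numInstances(); i++) {
            Instance inst = test.instance(i);
            double score = clf.distributionForInstance(inst)[positiveClass];
            if ((int) inst.classValue() == positiveClass) {
                pos.add(score);
            } else {
                neg.add(score);
            }
        }
        double wins = 0.0;
        for (double p : pos) {
            for (double n : neg) {
                if (p > n) {
                    wins += 1.0;
                } else if (p == n) {
                    wins += 0.5;
                }
            }
        }
        // Assumes the test set contains at least one instance of each class.
        return wins / (pos.size() * (double) neg.size());
    }
}
```

If this hand-rolled value also comes out as exactly 1.0, the AUC computation itself is probably not the culprit.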

Firebug
  • It is WEKA's `Evaluation.weightedAreaUnderROC()`, so I think the calculation should be correct. I might check how much the training set really differs from the test set... Thank you – Jan Jul 18 '16 at 23:54

An AUROC of 1.0 on the test set (assuming it was computed correctly) means either that your classifier managed to learn the task very well (assuming the test set is varied enough to decently represent the kind of samples your classifier will be used on in the future), or that your test data leaked into your training data (a.k.a. data contamination).
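A crude way to test the contamination hypothesis is to count how many test instances also occur verbatim in the training set. The sketch below (assuming WEKA `Instances`, with each instance's string representation used as a cheap fingerprint) only catches exact duplicates, so treat it as a red-flag check rather than a proof:

```java
import java.util.HashSet;
import java.util.Set;

import weka.core.Instances;

public class LeakageCheck {

    // Count test instances whose attribute values match a training instance exactly.
    public static int countExactOverlap(Instances train, Instances test) {
        Set<String> seen = new HashSet<>();
        for (int i = 0; i < train.numInstances(); i++) {
            seen.add(train.instance(i).toString());
        }
        int overlap = 0;
        for (int i = 0; i < test.numInstances(); i++) {
            if (seen.contains(test.instance(i).toString())) {
                overlap++;
            }
        }
        return overlap;
    }
}
```

Identical samples can occur legitimately, but a large overlap between two supposedly independent captures is a strong hint of leakage.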


Franck Dernoncourt
  • (+1) For data leakage. I came here to post just that comment. It could also be that the response leaked into the predictors. – Matthew Drury Jul 19 '16 at 02:19
  • Thank you for this. I've never heard of it before, and I do not fully understand it. If I do a grid search and in every iteration build a completely new classifier (in terms of declaration/initialization in the code), how can the previously used test set not be completely new to the classifier? – Jan Jul 19 '16 at 11:24
  • @Jan E.g., a bug in the code that takes care of creating the train and test sets. – Franck Dernoncourt Jul 19 '16 at 13:53
  • The question was regarding the data leakage. – Jan Jul 19 '16 at 14:15
  • @FranckDernoncourt That seems to be what happened. I hate call by reference and trying to copy a List in Java... Thank you for making me question that possibility. – Jan Jul 20 '16 at 10:18
  • @Jan np, been there. – Franck Dernoncourt Jul 20 '16 at 19:23
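For anyone who runs into the same pitfall Jan describes in the last comments: in Java, assigning a list or an `Instances` object copies only the reference, so modifying the "copy" silently modifies the original and can mix training and test data. A minimal illustration (variable names and the surrounding method are made up for this example):

```java
import java.util.ArrayList;
import java.util.List;

import weka.core.Instance;
import weka.core.Instances;

public class CopyPitfall {

    static void demo(Instances data, List<Instance> trainList) {
        // Aliasing: both names point to the SAME list, so removing a
        // "test" element here also removes it from trainList.
        List<Instance> aliasedTest = trainList;
        aliasedTest.remove(0);        // trainList shrinks too!

        // Real copies: a new list holding the same elements, and a new
        // Instances object with its own row storage.
        List<Instance> copiedTest = new ArrayList<>(trainList);
        Instances copiedData = new Instances(data);
        copiedTest.remove(0);         // trainList is unaffected
        copiedData.delete(0);         // data is unaffected
    }
}
```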