Statistically significant comparison of classifiers

Question

I am working on a movie review sentiment analysis project, and comparing various classifiers on the same dataset. The data for the two classes is balanced, so I'm using accuracy on 3-fold cross validation as the basic measure of performance.

How can I check whether one classifier is better than another, with statistical significance? Is this a test I can do directly on the final accuracy values, or do I need multiple accuracy values (one per fold), or even the individual classifications for each instance of data? Is a paired test applicable here?

Details:
Dataset: 1000 positive, 1000 negative reviews.
Features: Bag of unigrams (words).
Classifiers: Naive Bayes, SMO and LogReg.
Evaluation: Single accuracy percentage at the end of 3-fold stratified cross validation for each of the classifiers.

score 1 · Answer 1 · edited Apr 13 '17 at 12:44

This question on Cross Validated (CV), "How to statistically compare the performance of machine learning classifiers?", might help you. I recommend the literature pointed out in the second (accepted) answer.
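To make the "individual classifications for each instance" option from the question concrete, here is a minimal sketch of an exact McNemar test, one common paired test for two classifiers evaluated on the same instances. It is an illustration only, not necessarily the method recommended in the linked answer. The function name `mcnemar_exact` and the toy labels are hypothetical, and it assumes you have out-of-fold predictions from both classifiers on identical cross-validation splits.

```python
from math import comb

def mcnemar_exact(y_true, pred_a, pred_b):
    """Exact two-sided McNemar test on paired per-instance predictions.

    Returns (b, c, p_value), where b counts instances only classifier A
    got right and c counts instances only classifier B got right.
    """
    b = sum(1 for t, a, o in zip(y_true, pred_a, pred_b) if a == t and o != t)
    c = sum(1 for t, a, o in zip(y_true, pred_a, pred_b) if a != t and o == t)
    n = b + c
    if n == 0:
        # The classifiers never disagree in correctness: no evidence either way.
        return b, c, 1.0
    k = min(b, c)
    # Under H0 the discordant instances split 50/50, so b ~ Binomial(n, 0.5).
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return b, c, min(1.0, 2 * tail)

# Toy data (made up for illustration): 8 reviews, two classifiers.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
pred_a = [1, 1, 1, 1, 0, 0, 0, 1]
pred_b = [1, 0, 0, 1, 0, 1, 1, 1]
b, c, p_value = mcnemar_exact(y_true, pred_a, pred_b)
print(b, c, p_value)
```

Only the instances where exactly one classifier is correct carry information here, which is why two classifiers with similar overall accuracy can still be distinguished (or not) by this kind of test.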

answered Jan 20 '15 at 00:51

Jacques Wainer

