
Possible Duplicate:
Comparing two classifier accuracy results for statistical significance with t-test

I coded two Naive Bayes classifiers (using different features of the same data) and used incremental k-fold cross-validation.

As output, for each of the two NBCs and for each of the k training-set sizes, I have computed the average accuracy and its standard error.
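For concreteness, here is a minimal sketch (in Python, with hypothetical placeholder accuracy values, since the actual numbers are not given) of the kind of per-fold summary described above:

```python
# Minimal sketch: summarising per-fold accuracies from k-fold cross-validation.
# The accuracy values below are hypothetical placeholders, not real results.
import numpy as np
from scipy import stats

acc_nbc_a = np.array([0.81, 0.79, 0.83, 0.80, 0.78, 0.82, 0.84, 0.80, 0.79, 0.81])
acc_nbc_b = np.array([0.76, 0.78, 0.75, 0.77, 0.74, 0.79, 0.76, 0.75, 0.78, 0.77])

for name, acc in [("NBC A", acc_nbc_a), ("NBC B", acc_nbc_b)]:
    # Standard error of the mean accuracy across the k folds
    print(f"{name}: mean accuracy = {acc.mean():.3f}, SE = {stats.sem(acc):.3f}")
```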

Upon observing these data, I formed the hypothesis that one of the two classifiers performs better (in terms of accuracy).

How can I assess the significance of my results, i.e. decide whether or not to reject the null hypothesis in favor of my hypothesis?

Thanks for any help :)

  • You write a test that assesses the accuracy of the results, and run that test on each algorithm. Whichever one produces a higher index of accuracy is the winner. If you need a better answer than that, you need to ask a more focused, specific question. – Robert Harvey Oct 29 '12 at 20:14
  • @RobertHarvey I don't think you read my question right; I already have the average accuracies and their corresponding standard error. I'm trying to come up with a proper way to do hypothesis testing using these values to figure out the significance of what I am observing. –  Oct 29 '12 at 20:31
  • Wouldn't the usual statistical analysis with Standard Deviation suffice to determine a confidence level? – Robert Harvey Oct 29 '12 at 20:38
  • Maybe I should have mentioned that and I am sorry but I'm not a Stat person; I guess that what you called "usual statistical analysis with standard deviation" is what I'm looking for, could you elaborate? –  Oct 29 '12 at 20:44
  • From a statistics point of view, you *need* to do another test, to validate your results. You must not base your hypothesis on your result and use it again to validate the hypothesis. If you used your observations to formulate an assumption, validate it in an *independent* experiment! – Has QUIT--Anony-Mousse Oct 29 '12 at 22:43

1 Answer


I am not a stats person, but I think the Kolmogorov–Smirnov test might be useful to you (or any goodness-of-fit test, for that matter).
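For illustration only, and with the caveat raised in the comments below that a goodness-of-fit test may not be the right tool for comparing accuracies, a two-sample Kolmogorov–Smirnov test on hypothetical per-fold accuracies could look like this:

```python
# Sketch: two-sample Kolmogorov-Smirnov test on per-fold accuracies.
# acc_nbc_a / acc_nbc_b are hypothetical placeholder values, not real results.
import numpy as np
from scipy import stats

acc_nbc_a = np.array([0.81, 0.79, 0.83, 0.80, 0.78, 0.82, 0.84, 0.80, 0.79, 0.81])
acc_nbc_b = np.array([0.76, 0.78, 0.75, 0.77, 0.74, 0.79, 0.76, 0.75, 0.78, 0.77])

stat, p_value = stats.ks_2samp(acc_nbc_a, acc_nbc_b)
print(f"KS statistic = {stat:.3f}, p-value = {p_value:.3f}")
```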

lucasg
  • Why using a goodness-of-fit test (and the KS test is used to compare two distributions) when the question is about comparing two proportions (accuracy)? – chl Oct 30 '12 at 15:18
  • I thought you just wanted to assert that one method was better than the other. The [Welch t-test](http://en.wikipedia.org/wiki/Welch's_t_test) gives you a p-value for the null hypothesis; under it the test statistic follows [Student's t-distribution](http://en.wikipedia.org/wiki/Student's_t-distribution) (a minimal sketch follows these comments). – lucasg Oct 30 '12 at 16:40
  • Not me, but the OP :-) And not quite there since Welch's test is used for comparing two sample means assuming unequal population variances (see the [Behrens–Fisher](http://en.wikipedia.org/wiki/Behrens–Fisher_problem) problem). – chl Oct 30 '12 at 23:13
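A minimal sketch of the Welch's t-test suggested in the comments, again on hypothetical per-fold accuracies; note that if both classifiers were evaluated on the same folds, a paired test such as `scipy.stats.ttest_rel` is usually considered more appropriate:

```python
# Sketch: Welch's t-test (unequal variances) on per-fold accuracies.
# acc_nbc_a / acc_nbc_b are hypothetical placeholder values, not real results.
import numpy as np
from scipy import stats

acc_nbc_a = np.array([0.81, 0.79, 0.83, 0.80, 0.78, 0.82, 0.84, 0.80, 0.79, 0.81])
acc_nbc_b = np.array([0.76, 0.78, 0.75, 0.77, 0.74, 0.79, 0.76, 0.75, 0.78, 0.77])

t_stat, p_value = stats.ttest_ind(acc_nbc_a, acc_nbc_b, equal_var=False)
print(f"Welch's t = {t_stat:.3f}, p-value = {p_value:.3f}")
```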