I use the chi-square test for feature selection, but I apply it only when all entries in the contingency table are greater than 5.
Is that the correct approach statistically?
What happens, for example, if a feature appears 1000 times, but only in positive examples? The contingency table then contains a zero entry, so my rule skips the feature, yet it seems that it should clearly pass the test. Am I using the test wrong?
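
To make the case concrete, here is a minimal sketch in plain Python. The counts are made up for illustration (5000 positive and 5000 negative examples, with the feature present in 1000 positives and no negatives), and `chi_square_stat` is a hypothetical helper, not a library function. It computes the Pearson chi-square statistic without a continuity correction. It also prints the expected counts, since the usual rule of thumb is stated in terms of expected counts rather than observed ones.

```python
def chi_square_stat(table):
    """Return (statistic, expected_counts) for a contingency table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    # Expected count under independence: row total * column total / grand total.
    expected = [[r * c / total for c in col_totals] for r in row_totals]
    stat = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    return stat, expected


# Rows: feature present / feature absent. Columns: positive / negative.
# Assumed counts, chosen only to illustrate the question.
table = [
    [1000, 0],
    [4000, 5000],
]

stat, expected = chi_square_stat(table)
min_observed = min(min(row) for row in table)
min_expected = min(min(row) for row in expected)

# 3.841 is the critical value for 1 degree of freedom at the 0.05 level.
CRITICAL_VALUE = 3.841

print("smallest observed count:", min_observed)
print("smallest expected count:", min_expected)
print("chi-square statistic:", round(stat, 2))
print("exceeds critical value:", stat > CRITICAL_VALUE)
```

With these counts, my filter on observed entries rejects the table because of the zero cell, even though every expected count is large and the statistic is far above the critical value.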