I am looking for some intuition to understand the impact of individual features on the accuracy of a classification algorithm. I compute accuracy by performing a 50-50 train-test split on the dataset. The classification algorithm I am using is kNN. I have two queries:
- Assume I have 2 features, and say I get 10% accuracy with feature 1 alone and 20% accuracy with feature 2 alone. For simplicity, assume the two features are independent and have been scaled. What accuracy should I expect when I use both features together? Are there any limits (a minimum/maximum range)? Is there any theoretical support for such limits?
Specifically, can the accuracy improve drastically (compared to the accuracy with either single feature) when both features are used together? In one of my datasets, each feature individually gives about 4% accuracy, but together they give more than 40%.
- Taking (1) further, is it possible that the accuracy in fact degrades (i.e., drops below 20%) when I use both features together? In other words, under what circumstances might accuracy degrade when more information is provided?
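To make the first question concrete, here is a minimal synthetic sketch (using plain numpy and a hand-rolled kNN, not my real dataset) of the "weak alone, strong together" situation: an XOR-style layout where each feature's marginal distribution is identical across classes, so either feature alone is at chance level, yet the two together separate the classes almost perfectly.

```python
import numpy as np

rng = np.random.default_rng(0)

# XOR-style data: the class depends only on whether the two features
# have the same sign, so each feature alone carries no class information.
n = 400
X = rng.uniform(-1, 1, size=(n, 2))
y = (np.sign(X[:, 0]) * np.sign(X[:, 1]) > 0).astype(int)

def knn_accuracy(X_tr, y_tr, X_te, y_te, k=5):
    """Plain kNN: Euclidean distances, majority vote among k neighbors."""
    d = np.linalg.norm(X_te[:, None, :] - X_tr[None, :, :], axis=2)
    nn = np.argsort(d, axis=1)[:, :k]          # indices of k nearest neighbors
    pred = (y_tr[nn].mean(axis=1) > 0.5).astype(int)
    return (pred == y_te).mean()

# 50-50 train-test split, as in the question
half = n // 2
tr, te = slice(0, half), slice(half, n)

for cols, name in [([0], "feature 1 alone"),
                   ([1], "feature 2 alone"),
                   ([0, 1], "both features")]:
    acc = knn_accuracy(X[tr][:, cols], y[tr], X[te][:, cols], y[te])
    print(f"{name}: {acc:.2f}")
```

On this toy data each single feature scores near 50% (chance for two classes), while the pair scores far higher, so at least in principle there seems to be no tight upper bound on the joint accuracy derivable from the individual accuracies alone.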
Thanks, Girish