I'm trying to gain an intuition for why increasing the number of features could reduce performance. I'm currently using an LDA classifier that performs better on certain pairs of features (bivariately) but worse when I include more features. I estimate classification accuracy using stratified 10-fold cross-validation.
Is there a simple case in which a classifier works better univariately than bivariately, one that would give a somewhat physical or spatial intuition for what is happening in these higher dimensions?