I used a plain multivariable logistic regression (the default you get from glm() with the logit link in R) on a binary classification problem with approximately 100 predictors, i.e. quite a lot of variables in the regression. My approach was quite naive; I just wanted to see some quick classification results, and they were not bad: about 85% classification accuracy.
Many of the variables' coefficients have reasonably low p-values (<0.05), but quite a few have p-values >0.6, sometimes even 0.8. Intuitively I would tell myself: never mind those variables, get rid of them, because you can't be sure their estimated coefficients are even approximately correct. But when I removed a couple of the variables with high coefficient p-values (e.g. >0.5) at random, the resulting classification accuracy dropped, even on the validation dataset.
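For concreteness, here is a minimal sketch of what I did, with simulated data standing in for my real dataset (the sample size, the 20 predictors, and the assumption that only a few of them truly matter are all made up for the example; my actual data has ~100 predictors):

```r
# Simulated stand-in for my real dataset: 20 predictors, only the first 5 matter
set.seed(42)
n <- 500; p <- 20
X <- matrix(rnorm(n * p), n, p)
beta <- c(rep(1, 5), rep(0, p - 5))
y <- rbinom(n, 1, plogis(X %*% beta))
dat <- data.frame(y = y, X)

# Full model: logistic regression on all predictors
full <- glm(y ~ ., data = dat, family = binomial)

# Coefficient p-values (column 4 of the coefficient table; drop the intercept row)
pvals <- summary(full)$coefficients[-1, 4]

# Reduced model: refit without predictors whose p-value exceeds 0.5
keep <- names(pvals)[pvals <= 0.5]
reduced <- glm(reformulate(keep, response = "y"), data = dat, family = binomial)

# In-sample classification accuracy at a 0.5 probability cutoff
acc <- function(m) mean((predict(m, type = "response") > 0.5) == dat$y)
c(full = acc(full), reduced = acc(reduced))
```

(In my real setup I also evaluated accuracy on a held-out validation set rather than only in-sample, as above.)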
Should I then keep those variables in the model just for the sake of classification accuracy, even though their coefficients are quite likely zero (or simply quite different from what the model estimated)? Why is this happening? Am I overfitting the model? Or am I just worrying over nothing because there is something about properly interpreting p-values that I don't know? Or is it a coincidence of my particular training/testing split, or does this (removing presumably insignificant variables and seeing classification accuracy deteriorate) happen often in similar situations?
Thank you very much in advance.