I have a set of samples with two labels, red and black. I can build a logistic regression model to predict the label colour. Once the model is built, I would like to test whether or not it is overfitting.
Normally, I would set aside, say, 30% of my sample as an out-of-sample holdout, build a logistic regression model on the 70% development sample, and test its performance (for example, the Gini coefficient) on the out-of-sample portion. If I see the Gini drop from the development sample to the out-of-sample, I know the model may be overfit.
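To make the procedure concrete, here is a minimal sketch of that holdout check in Python with scikit-learn, using synthetic data in place of the real red/black sample (the data, features, and seed are all hypothetical):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Stand-in for the real red/black sample
X, y = make_classification(n_samples=500, n_features=5, random_state=0)

# 70% development sample, 30% out-of-sample holdout;
# stratify to keep the red/black ratio the same in both parts
X_dev, X_oos, y_dev, y_oos = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

model = LogisticRegression().fit(X_dev, y_dev)

# Gini coefficient = 2 * AUC - 1
gini_dev = 2 * roc_auc_score(y_dev, model.predict_proba(X_dev)[:, 1]) - 1
gini_oos = 2 * roc_auc_score(y_oos, model.predict_proba(X_oos)[:, 1]) - 1

print(f"Development Gini:   {gini_dev:.3f}")
print(f"Out-of-sample Gini: {gini_oos:.3f}")
# A large drop from development to out-of-sample suggests overfitting.
```

The worry in the question is precisely that with few reds or blacks, the 30% held out here is data the model badly needs.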
However, when my sample contains only a small number of reds (or blacks, or both), I am reluctant to set aside an out-of-sample portion for validation. I'd rather use as much data as possible, given the limitation. So what effective validation methods, other than the one I described above, can be used?
Note that here I am not trying to determine the best model form (logistic regression is the one to use), nor to tune any parameters (I don't need to determine the number of variables to use). I simply have a model and want to test whether it is overfitting the training sample or not. So I don't think cross-validation would be applicable here.