I ran a logistic regression on a data set of 3,700 patients. I have 9 predictor variables, and my outcome is binary: presence or absence of a disease. I obtained the regression coefficients and the predicted probabilities. When I apply this model to another data set, no matter what I do, the area under the ROC curve (AUC) does not go above 0.56 (56%).
I am assuming my model is underfitting. How can I improve it and reduce the high bias? Is there a way to calculate the bias in software, and how can I fix the underfit in software?
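To make the question concrete, here is a minimal sketch in plain Python of one check that might help tell underfitting apart from other problems: compute the area under the ROC curve (AUC) from the predicted probabilities on both the original data and the second data set, and compare them. The function names `auc` and `auc_gap` are hypothetical helpers, and the numbers are made-up toy values, not my patient data.

```python
def auc(labels, scores):
    """Area under the ROC curve via the Mann-Whitney U statistic.

    labels: 0/1 outcomes; scores: predicted probabilities.
    A tie between a positive and a negative case counts as half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def auc_gap(train_labels, train_scores, test_labels, test_scores):
    """Return (AUC on fitting data, AUC on new data, difference)."""
    a = auc(train_labels, train_scores)
    b = auc(test_labels, test_scores)
    return a, b, a - b


# Toy, made-up values purely for illustration (assumption, not real data).
train_y = [0, 0, 1, 1, 0, 1]
train_p = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7]
test_y = [0, 1, 0, 1]
test_p = [0.3, 0.2, 0.6, 0.7]

train_auc, test_auc, gap = auc_gap(train_y, train_p, test_y, test_p)
print(f"train AUC={train_auc:.3f}  test AUC={test_auc:.3f}  gap={gap:.3f}")
```

My understanding, which may be wrong, is that if both AUCs are close to 0.5 the model is probably underfitting or the predictors carry little signal, whereas a high AUC on the original data with a low AUC on the new data would point to overfitting or to the second data set differing from the first.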
Thank you very much to anyone who provides a solution.