I am building a predictive model with 7 features and a binary target. I tried XGBoost in R with a linear regression objective:
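library(xgboost)

# Fit boosted trees with a squared-error (linear regression) objective
bst <- xgboost(data = as.matrix(trainSet[, predictors]),
               label = trainSet[, outcomeName], max.depth = 10,
               nround = 1000, objective = "reg:linear", verbose = 0)
# outputmargin = TRUE returns the raw, untransformed model output
pred <- predict(bst, as.matrix(testSet[, predictors]), outputmargin = TRUE)
rmse(as.numeric(testSet[, outcomeName]), as.numeric(pred))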
Since my target is binary, I also tried logistic regression, but the prediction quality is much worse than with linear regression: the misclassification rate is 23.4% with linear regression but rises to 47% with logistic. Is it OK to use linear regression here, or am I overfitting?
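
For reference, a minimal sketch of how a misclassification rate can be computed from predicted scores. The helper `misclass_rate`, the 0.5 cutoff, and the example values are assumptions made up for illustration; they are not taken from the code above. It also shows that scores on the log-odds scale (what `outputmargin = TRUE` returns under a logistic objective) and probabilities give different labels under the same cutoff.

```r
# Hypothetical helper (not from the original post):
# fraction of observations whose thresholded score differs from the true label
misclass_rate <- function(actual, score, threshold = 0.5) {
  predicted <- as.integer(score > threshold)
  mean(predicted != actual)
}

# Made-up values for illustration only
actual <- c(0, 1, 1, 0, 1)
margin <- c(-1.2, 0.4, 2.0, -0.3, 0.2)  # scores on the log-odds scale

# Inverse logit: converts log-odds to probabilities in (0, 1)
prob <- plogis(margin)

rate_prob   <- misclass_rate(actual, prob)    # probabilities cut at 0.5
rate_margin <- misclass_rate(actual, margin)  # raw log-odds cut at 0.5

print(c(probability = rate_prob, margin = rate_margin))
```

A probability cutoff of 0.5 corresponds to a log-odds cutoff of 0, so applying 0.5 directly to raw margins is a stricter rule and can change the reported error rate.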