The title essentially says it all. Below are some details regarding my data and model.
This is the current class distribution within my training set:
0 1353849
1 26217
Name: binary, dtype: int64
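For scale, here is a quick sketch of the arithmetic behind those counts. The variable names are only for illustration. The negatives-per-positive ratio is also the value the XGBoost documentation suggests as a typical starting point for `scale_pos_weight`.

```python
# Class counts copied from the distribution shown above.
n_neg = 1_353_849  # class 0
n_pos = 26_217     # class 1

pos_rate = n_pos / (n_neg + n_pos)  # fraction of rows that are positive
neg_per_pos = n_neg / n_pos         # negatives for every positive

print(f"positive rate: {pos_rate:.4f}")
print(f"negatives per positive: {neg_per_pos:.1f}")
```

So positives make up roughly 1.9% of the training set, or about one row in every 53.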
My training set includes 104 features.
My current recall is 94% and my current precision is 20%.
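To put those two figures in perspective, here is a small sketch that uses only the recall and precision quoted above. It follows from the standard definitions (precision = TP / (TP + FP), F1 as the harmonic mean), not from any extra information about the model.

```python
recall = 0.94
precision = 0.20

# precision = TP / (TP + FP)  =>  FP / TP = (1 - precision) / precision
fp_per_tp = (1 - precision) / precision

# F1 is the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)

print(f"false positives per true positive: {fp_per_tp:.1f}")
print(f"F1: {f1:.3f}")
```

In other words, the model currently raises about four false alarms for every true positive it finds.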
Here are the hyperparameters for my XGBoost model:
nrounds = 500, eta = 0.2, max_depth = 20, subsample = 0.8, colsample_bytree = 0.2, reg_alpha = 0.1, reg_lambda = 0.8
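In case it helps to reproduce the setup, here is a rough sketch of how those settings might map onto the Python `xgb.train` API. Two things here are assumptions rather than something stated above: `nrounds` is the R-style name and is taken to correspond to `num_boost_round` in Python, and the objective is assumed to be `binary:logistic` because the target is binary. The block only builds the configuration and does not train anything.

```python
# Hyperparameters as listed above, arranged for the Python xgboost API.
params = {
    "objective": "binary:logistic",  # assumption: not stated in the post
    "eta": 0.2,
    "max_depth": 20,
    "subsample": 0.8,
    "colsample_bytree": 0.2,
    "reg_alpha": 0.1,
    "reg_lambda": 0.8,
}

# R-style `nrounds`; passed separately as num_boost_round in Python.
NUM_BOOST_ROUND = 500

# Training call, left commented out so the sketch runs without data:
# import xgboost as xgb
# booster = xgb.train(params, dtrain, num_boost_round=NUM_BOOST_ROUND)
# where dtrain is an xgb.DMatrix built from the 104-feature training set.

print(params, NUM_BOOST_ROUND)
```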
I've tried SMOTE, but it isn't working well, likely because of the high dimensionality. If you have any recommendations, they would be much appreciated.