I have a highly imbalanced dataset (99.8% negative, 0.2% positive) with approximately 60 variables. I removed around 40 of them based on the variance inflation factor (VIF), then used SMOTE to oversample the minority class.
I am now using XGBoost to build a model. I have tried class weighting, regularization, and tuning the other hyperparameters with RandomizedSearchCV. However, my model still massively overfits: the F1 score is approximately 0.5, with precision and recall both dramatically reduced. How do I reduce overfitting in this scenario?
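For concreteness, here is a rough sketch of the class weight this imbalance implies, assuming the common heuristic of setting XGBoost's `scale_pos_weight` to the ratio of negative to positive rows. The row count and the helper name are hypothetical and only there to illustrate the arithmetic; they are not my real data or code.

```python
def imbalance_weight(n_negative: int, n_positive: int) -> float:
    """Ratio of negative to positive rows (a common heuristic for scale_pos_weight)."""
    if n_positive <= 0:
        raise ValueError("need at least one positive row")
    return n_negative / n_positive


# Hypothetical row count, chosen only to match the 99.8% / 0.2% split.
n_rows = 100_000
n_positive = n_rows * 2 // 1000   # 0.2% of the rows
n_negative = n_rows - n_positive  # remaining 99.8%

weight = imbalance_weight(n_negative, n_positive)
print(f"positives={n_positive}, negatives={n_negative}, weight={weight}")
```

This ratio describes the original data only. If SMOTE has already balanced the training set, the ratio computed on the resampled data would be close to 1, so using both at once may over-correct for the imbalance.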