What are the best ways to deal with an imbalanced dataset when classifying whether or not individuals pay their tuition? The data is 75% positive class (paid) and 25% negative class (unpaid). Approaches I have read about include stratified k-fold cross-validation, undersampling and oversampling, and synthetic data generation with methods like SMOTE. My current problem is that my XGBoost classifier predicts the positive class for almost every sample because the data leans toward that class.
Rather than tackling the imbalance by modifying the data, are there classification algorithms that handle imbalanced data better than others?
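To make the "without modifying the data" option concrete, here is a minimal sketch of the kind of class weighting I have in mind. It assumes the common "balanced" heuristic, n_samples / (n_classes * class_count), and the ratio the XGBoost documentation suggests as a typical value for `scale_pos_weight` (negative count / positive count). The label list is made up to mirror my 75/25 split, and `balanced_class_weights` is a hypothetical helper, not a library function.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Hypothetical helper: weight each class inversely to its frequency,
    using n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Made-up labels mirroring the 75/25 split: 1 = paid, 0 = unpaid.
labels = [1] * 75 + [0] * 25

weights = balanced_class_weights(labels)

# Ratio of negative to positive samples, the typical starting value
# suggested for XGBoost's scale_pos_weight parameter.
counts = Counter(labels)
scale_pos_weight = counts[0] / counts[1]

print(weights)
print(scale_pos_weight)
```

With paid as the positive class, the ratio comes out below 1, which down-weights the majority class. Whether weighting like this is enough on its own, or whether some algorithms simply cope better, is part of what I am asking.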
Lastly, when is data considered imbalanced from a practical standpoint (60-40, 80-20, 95-5, etc.)? Essentially, I am asking whether mild cases of imbalance are still worth addressing, or only severe ones.