I'm trying to build a model for the Credit Card Fraud dataset. I used a combination of under- and oversampling to balance the data, then trained a neural network and tuned it with keras-tuner. The best val_auc I got is around 98%, but when I evaluate on the unbalanced test data I get an AUPRC of only around 53%.
Report:
              precision    recall  f1-score   support
   Not Fraud       1.00      0.90      0.95     56861
       Fraud       0.02      0.87      0.03       101
    accuracy                           0.90     56962
   macro avg       0.51      0.89      0.49     56962
weighted avg       1.00      0.90      0.95     56962
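To make the report easier to read, here is a rough back-of-the-envelope reconstruction of the confusion-matrix counts it implies. This is only a sketch: it uses the rounded recall and support values from the report, not the actual predictions from the notebook, so the counts are approximate.

```python
# Approximate confusion-matrix counts recovered from the rounded report values.
# Assumption: the recalls below are the two-decimal figures printed in the
# report, so the resulting counts are estimates, not exact numbers.
support_not_fraud = 56861
support_fraud = 101
recall_not_fraud = 0.90
recall_fraud = 0.87

tp = round(recall_fraud * support_fraud)          # frauds correctly flagged
fn = support_fraud - tp                           # frauds missed
tn = round(recall_not_fraud * support_not_fraud)  # legitimate, correctly passed
fp = support_not_fraud - tn                       # legitimate, flagged as fraud

precision_fraud = tp / (tp + fp)

print(f"TP={tp} FN={fn} TN={tn} FP={fp}")
print(f"fraud precision ~ {precision_fraud:.3f}")
```

So the model catches most of the frauds, but roughly 10% of the legitimate transactions are also flagged, and because legitimate transactions outnumber frauds by several hundred to one, those false positives swamp the true positives and drive the fraud precision down to about 0.02.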
What could be the reason for this gap, and what can I do to improve my model?
The code is in the Kaggle notebook I'm working on: Kaggle Notebook