I have highly imbalanced data with 2 classes. For example, 4000 samples, of which only 20 belong to the positive class.
My idea is:
Train = 2000 samples (50%: 10 positive samples and 1990 negative samples).
Test = 1000 samples (25%: 5 positive samples and 995 negative samples).
Validate = 1000 samples (25%: 5 positive samples and 995 negative samples).
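To make the split concrete, here is a minimal sketch of what I have in mind, assuming the labels are a plain list of 0/1 values. The function name `stratified_split`, the `fractions` argument, and the seed are my own illustrative choices, not from any library. Each class is shuffled once and then cut into consecutive slices, so no sample is drawn twice (sampling without replacement) and each part keeps the same class ratio.

```python
import random

def stratified_split(labels, fractions=(0.5, 0.25, 0.25), seed=0):
    """Split sample indices into parts, keeping the class ratio in each part.

    Hypothetical helper for illustration; the order of `fractions` here is
    assumed to be (Train, Test, Validate).
    """
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = {}
    for idx, y in enumerate(labels):
        by_class.setdefault(y, []).append(idx)

    splits = [[] for _ in fractions]
    for indices in by_class.values():
        rng.shuffle(indices)  # one shuffle, then slicing = no replacement
        start = 0
        for k, frac in enumerate(fractions):
            if k == len(fractions) - 1:
                end = len(indices)  # last part takes whatever remains
            else:
                end = start + round(frac * len(indices))
            splits[k].extend(indices[start:end])
            start = end
    return splits

# Toy labels matching the example: 20 positives, 3980 negatives.
labels = [1] * 20 + [0] * 3980
train_idx, test_idx, val_idx = stratified_split(labels)

for name, part in [("Train", train_idx), ("Test", test_idx), ("Validate", val_idx)]:
    positives = sum(labels[i] for i in part)
    print(name, len(part), "samples,", positives, "positive")
```

With these toy labels the three parts come out as 2000/1000/1000 samples with 10/5/5 positives, which matches the numbers above. The Support/Query sampling inside Train and Validate is not covered by this sketch.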
From my understanding, I should draw the Test data by sampling without replacement. The rest will be used for preparing Train and Validate, as described by this diagram.
For the Test data, do I need to divide it into a Support set and a Query set, as is done for Train and Validate, or should I create only a Query set?
In the diagram, are task1, task2, and task3 in the red rectangle the same thing as mini-batches?
May I have your suggestions?