We are conducting a predictive analysis (a classic binary classifier: whether or not the patient developed the disease) on longitudinal/panel data (each patient contributes one or more observations, depending on whether they left the study early).
The outcome [disease: yes (positive class) or no (negative class)] is imbalanced: the negative class is far more represented than the positive class.
Now the question is: would it make any sense to use oversampling techniques (e.g. SMOTE and similar) to balance the outcome classes, given that we have longitudinal/panel data? Could SMOTE/oversampling, by ignoring the correlation among observations from the same patient, simply introduce more noise into the analysis?
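To make the setup concrete, here is a minimal sketch of the kind of pipeline we have in mind. It is only an illustration under our own assumptions, not an established recipe: the toy data, the field names (`patient_id`, `visit`, `disease`) and both helper functions are hypothetical, and plain random oversampling stands in for SMOTE. The train/test split is done by patient, and oversampling is applied to the training part only.

```python
import random
from collections import Counter


def split_by_patient(rows, test_frac=0.3, seed=0):
    """Split rows so that all observations of a patient land on the same side."""
    rng = random.Random(seed)
    patients = sorted({r["patient_id"] for r in rows})
    rng.shuffle(patients)
    n_test = max(1, round(test_frac * len(patients)))
    test_ids = set(patients[:n_test])
    train = [r for r in rows if r["patient_id"] not in test_ids]
    test = [r for r in rows if r["patient_id"] in test_ids]
    return train, test


def oversample_minority(rows, seed=0):
    """Randomly duplicate minority-class rows until both classes have equal size.

    This is plain random oversampling (not SMOTE) and it treats every row as
    independent, i.e. it ignores which patient a row comes from.
    """
    rng = random.Random(seed)
    counts = Counter(r["disease"] for r in rows)
    if len(counts) < 2:
        return list(rows)  # only one class present, nothing to balance
    minority, majority = sorted(counts, key=counts.get)
    pool = [r for r in rows if r["disease"] == minority]
    deficit = counts[majority] - counts[minority]
    extra = [rng.choice(pool) for _ in range(deficit)]
    return list(rows) + extra


# Hypothetical toy panel: 10 patients with 1 to 3 visits each; patients 0 and 1
# are the positive class, so the outcome is imbalanced.
rows = [
    {"patient_id": pid, "visit": v, "disease": 1 if pid < 2 else 0}
    for pid in range(10)
    for v in range(1 + pid % 3)
]

train, test = split_by_patient(rows, test_frac=0.3, seed=0)
train_balanced = oversample_minority(train, seed=0)

print("train before:", Counter(r["disease"] for r in train))
print("train after: ", Counter(r["disease"] for r in train_balanced))
print("test:        ", Counter(r["disease"] for r in test))
```

Note that the duplicated rows are drawn as if observations were independent, so the within-patient correlation is still ignored at the oversampling step, which is exactly the point we are unsure about.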
Thanks a lot in advance for your support.