What are the possible caveats of using a Bagging algorithm (such as Random Forest) when the data are not independent?
Ensemble models often use Bagging to reduce variance by aggregating several models, each built on a bootstrap sample of the original dataset. However, the observations may be dependent, as happens with time-series or longitudinal data.
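To make the setup concrete, here is a minimal sketch of the kind of Bagging I have in mind, assuming the plain row-wise bootstrap. The function names, the AR(1) toy series, and the use of the sample mean as a stand-in "base learner" are my own illustrative choices, not any library's implementation. The point is only to show where the resampling treats rows as exchangeable.

```python
import numpy as np

def bootstrap_indices(n, rng):
    # Plain bootstrap: draw n row indices uniformly with replacement.
    # Every row is treated as exchangeable, so temporal order is ignored.
    return rng.integers(0, n, size=n)

def bagged_mean(y, n_estimators, rng):
    # Hypothetical toy base learner: the sample mean of each bootstrap replicate.
    # Bagging then averages the base learners' outputs.
    n = len(y)
    estimates = [y[bootstrap_indices(n, rng)].mean() for _ in range(n_estimators)]
    return float(np.mean(estimates))

rng = np.random.default_rng(0)

# Assumed example data: an AR(1) series, so consecutive observations are dependent.
n, phi = 500, 0.8
y = np.zeros(n)
for t in range(1, n):
    y[t] = phi * y[t - 1] + rng.normal()

bagged = bagged_mean(y, n_estimators=50, rng=rng)
print(bagged, y.mean())
```

Nothing in `bootstrap_indices` knows that neighbouring rows are correlated, which is exactly the step I am unsure about.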
From a theoretical point of view, is this a problem?
Thank you in advance!