Question: In layman's terms, what does it mean for data (say, $n$ samples of $m$ variables) to be identically distributed, and how is that property practically achieved when doing machine learning?
So let's say each sample measures two predictors, $X$ and $Y$, and I want to predict an outcome $Z$.
To satisfy the "identically distributed" part of IID, should I be aiming for $X$ and $Y$ to look the same when I plot their estimated density functions? Or does "identically distributed" mean something else entirely?
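To make that concrete, this is roughly how I have been eyeballing the predictors' distributions (the data frame and values here are placeholders, not my real data):

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Placeholder data standing in for my real samples.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "X": rng.normal(loc=0.0, scale=1.0, size=1000),
    "Y": rng.exponential(scale=2.0, size=1000),
})

# Kernel-density estimates of each predictor, overlaid for comparison.
df["X"].plot(kind="kde", label="X")
df["Y"].plot(kind="kde", label="Y")
plt.legend()
plt.title("Estimated densities of the predictors")
plt.show()
```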
I have also read about "normalizing" (standardizing) sample data by subtracting the mean and dividing by the standard deviation, i.e. $z = (x - \bar{x}) / s$. Is the purpose of this transformation to make the data identically distributed?
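In code, I understand that transformation to mean something like the following (again a sketch with placeholder data):

```python
import numpy as np
import pandas as pd

# Placeholder predictors with different means and spreads.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "X": rng.normal(loc=5.0, scale=3.0, size=1000),
    "Y": rng.exponential(scale=2.0, size=1000),
})

# z-score standardization: subtract each column's mean, divide by its std.
standardized = (df - df.mean()) / df.std()

print(standardized.mean())  # roughly 0 for each column
print(standardized.std())   # roughly 1 for each column
```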
The context of my question is machine learning, specifically XGBoost on a regression problem. The model is performing in a mediocre to poor fashion and I want to understand why. Perhaps my data is not IID; I thought that didn't matter for tree-based models, but I am obviously missing something.
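For reference, my setup looks roughly like this (placeholder data and off-the-cuff hyperparameters; the real features and target are different):

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score
from xgboost import XGBRegressor

# Placeholder data standing in for my real predictors X, Y and outcome Z.
rng = np.random.default_rng(0)
n = 1000
features = pd.DataFrame({
    "X": rng.normal(size=n),
    "Y": rng.exponential(scale=2.0, size=n),
})
z = 2.0 * features["X"] - 0.5 * features["Y"] + rng.normal(scale=0.5, size=n)

X_train, X_test, z_train, z_test = train_test_split(
    features, z, test_size=0.2, random_state=0
)

# Fit a plain XGBoost regressor and check held-out performance.
model = XGBRegressor(n_estimators=300, learning_rate=0.05, max_depth=4)
model.fit(X_train, z_train)

print("test R^2:", r2_score(z_test, model.predict(X_test)))
```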