Intuition
In ML, I constantly run into the i.i.d. assumption for datasets, and I have developed an intuition of what this assumption really means. If I'm not mistaken:
- "independent" means that samples (rows of a tabular dataset) carry no information about each other. A violation of independence might be a dataset built from multiple samples of a few patients, where rows coming from the same patient are correlated.
- "identically distributed" means that the samples must come from the same joint distribution. A violation might be when cases are sampled over a long period of time (a shift in the underlying data-generating mechanism) with no time indicator. (I hope these intuitions are correct.)
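To make sure I understand my own intuition, here is a minimal simulation sketch of both violations (the patient effects, sample sizes, and drift rate are made-up illustration parameters, not from any real dataset):

```python
import numpy as np

rng = np.random.default_rng(0)

# Independence violation (hypothetical): 100 rows drawn from only
# 5 patients. Rows sharing a patient share a latent patient-level
# offset, so they are correlated with each other.
patient_effect = rng.normal(0, 3, size=5)       # one latent offset per patient
patient_id = rng.integers(0, 5, size=100)       # which patient each row came from
rows = patient_effect[patient_id] + rng.normal(0, 1, size=100)

# The variance within a single patient is much smaller than the
# overall variance -- evidence the rows are not independent draws.
overall_var = rows.var()
within_var = np.mean([rows[patient_id == p].var() for p in range(5)])

# "Identically distributed" violation (hypothetical): the mean of the
# data-generating process drifts over time, so early and late samples
# come from different distributions.
t = np.arange(1000)
drifting = rng.normal(loc=t / 200.0, scale=1.0)  # mean drifts from 0 toward 5
early_mean, late_mean = drifting[:200].mean(), drifting[-200:].mean()
```

If my intuition is right, `within_var` should be clearly below `overall_var`, and `late_mean` should sit well above `early_mean`.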
Formal Definition
Then I came across a formal definition of the i.i.d. assumption in Blitzstein & Hwang's "Introduction to Probability", which is stated for random variables:
Random variables are independent if they provide no information about each other; they are identically distributed if they have the same PMF [or PDF] (or equivalently, the same CDF). Whether two r.v.s are independent has nothing to do with whether or not they have the same distribution.
So they are talking about i.i.d. random variables, not samples. Their examples: X, the result of a first die roll, and Y, the result of a second die roll, are i.i.d.; X, the result of a die roll, and Y, the price of a stock market index, are independent but not identically distributed; and X, the number of heads in n fair coin tosses, and Y, the number of tails in the same n tosses, are dependent but identically distributed, as both follow $\text{Bin}(n, 1/2)$.
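The book's examples can be checked empirically. Here is a short simulation sketch (the sample sizes and seed are arbitrary choices of mine):

```python
import numpy as np

rng = np.random.default_rng(42)
n_sim = 100_000

# i.i.d.: two fair dice rolled independently.
x_die = rng.integers(1, 7, size=n_sim)
y_die = rng.integers(1, 7, size=n_sim)
corr_dice = np.corrcoef(x_die, y_die)[0, 1]   # near 0: independent

# Dependent but identically distributed: heads vs. tails in the
# same n tosses of a fair coin. Both are Bin(n, 1/2), yet Y = n - X.
n = 10
heads = rng.binomial(n, 0.5, size=n_sim)
tails = n - heads
corr_ht = np.corrcoef(heads, tails)[0, 1]     # -1: perfectly dependent
mean_gap = abs(heads.mean() - tails.mean())   # ~0: same distribution
```

The dice correlation should be near zero, while heads and tails are perfectly anti-correlated despite sharing the same $\text{Bin}(n, 1/2)$ distribution.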
Question
So what is i.i.d. actually referring to: samples or variables? Clearly my intuition and the formal definition do not match. Are both true but distinct matters? Is something wrong here?!
P.S. I even found a third explanation of i.i.d. in this question, which says:
Hence the classical assumption of i.i.d. is with respect to $u_i \sim \text{i.i.d.}$, not the set $\{x_i, y_i\}$.
where $u$ is the additive noise term in a regression model... so what about this?!