I am reading Elements of Statistical Learning (ESL) and trying to get a better grasp of machine learning techniques. I am a little confused about when to treat the predictors as fixed and when to treat them as random, and about which is more common.
In previous linear models courses I have studied, we never assumed any randomness about the predictors $\textbf{x}$ of the linear model, only about the response $y$. Formally, if we can write $y=\textbf{x}^T\beta+\epsilon$ (or more generally, $y=f(\textbf{x})+\epsilon$), then we choose $\textbf{x}_1,...,\textbf{x}_n$ and, through the random error $\epsilon$, obtain $y_1,...,y_n$. Treating the predictors as deterministic made sense to me, as we could think of this as sampling points in an experiment and observing some random response. However, I don't think ESL ever explicitly says that it treats the predictors as fixed, but it does so implicitly when it estimates $\beta$ via maximum likelihood (the maximum likelihood estimator never takes into account any randomness in $\textbf{x}$).
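To make sure I am describing the same thing, here is a minimal numpy sketch of the fixed-design view I have in mind (the true $\beta$, the evenly spaced design points, and the noise level are all just made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Fixed-design view: we *choose* the design points ourselves,
# e.g. evenly spaced values, as in a planned experiment.
n = 50
x = np.linspace(0, 10, n)               # deterministic predictors
X = np.column_stack([np.ones(n), x])    # add an intercept column

beta_true = np.array([1.0, 2.0])        # made-up "true" coefficients
eps = rng.normal(0.0, 1.0, size=n)      # the only source of randomness
y = X @ beta_true + eps                 # y = x^T beta + eps

# OLS estimate, which coincides with the MLE under Gaussian errors;
# note that nothing here models a distribution for x.
beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]
print(beta_hat)
```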
ESL allows for the possibility of random predictors, which I had never seen before, but it makes sense: we might only observe the predictors rather than choose them. My question is this: am I correct in saying that there are four different approaches to supervised learning (a toy data-generating sketch after the list illustrates what I mean by each)?
- Random relationship, deterministic predictors: $y=f(\textbf{x})+\epsilon$, $\textbf{x}$ fixed (as described above)
- Random relationship, random predictors: $y=f(\textbf{x})+\epsilon$, $\textbf{x}$ random
- Deterministic relationship, random predictors: $y=f(\textbf{x})$, $\textbf{x}$ random
- Deterministic relationship, deterministic predictors: $y=f(\textbf{x})$, $\textbf{x}$ fixed
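In case it helps pin down what I mean by each case, here is a toy data-generating sketch (the linear $f$, the uniform and normal choices, and the noise level are all just placeholders):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100
f = lambda x: 1.0 + 2.0 * x        # placeholder for the true f

# 1. Random relationship, deterministic predictors
x1 = np.linspace(0, 1, n)                   # x chosen by us
y1 = f(x1) + rng.normal(0, 0.5, n)          # noise enters through y

# 2. Random relationship, random predictors
x2 = rng.uniform(0, 1, n)                   # x merely observed, not chosen
y2 = f(x2) + rng.normal(0, 0.5, n)

# 3. Deterministic relationship, random predictors
x3 = rng.uniform(0, 1, n)
y3 = f(x3)                                  # y exactly determined by x

# 4. Deterministic relationship, deterministic predictors
x4 = np.linspace(0, 1, n)
y4 = f(x4)                                  # nothing random at all
```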
Many thanks. Let me know if you want me to clarify anything.
Edit: I tried searching but couldn't find anything at first; after browsing the "regression" tag I found this excellent answer by kjetil b halvorsen. I think I understand it now, more or less, but if anyone has any further comments I would love to hear them.