In Section 7.2 of Hastie, Tibshirani, and Friedman (2013), The Elements of Statistical Learning, we have a target variable $Y$ and a prediction model $\hat{f}(X)$ estimated from a training set $\mathcal{T} = \{(X_1, Y_1), \ldots, (X_N, Y_N)\}$. The loss is denoted $L(Y, \hat{f}(X))$, and the authors define the test error,
\begin{equation}
\mathrm{Err}_{\mathcal{T}} = \mathbb{E} \left[ L(Y, \hat{f}(X)) \mid \mathcal{T} \right] ,
\end{equation}
and the expected test error,
\begin{equation}
\mathrm{Err} = \mathbb{E} \left( \mathrm{Err}_{\mathcal{T}} \right) .
\end{equation}
The authors then state:
"Estimation of $\mathrm{Err}_{\mathcal{T}}$ will be our goal..."
My question: Why do we care more about $\mathrm{Err}_{\mathcal{T}}$ than $\mathrm{Err}$?
I would have thought that the expected loss averaged over all possible training samples would be more interesting than the expected loss conditional on one specific training sample. What am I missing here?
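To make the distinction concrete for myself, here is a minimal simulation sketch (my own construction, not from the book; the data-generating process, noise level, and polynomial fit are all arbitrary choices for illustration). $\mathrm{Err}_{\mathcal{T}}$ is approximated by fitting on one training set and evaluating on a large test draw; $\mathrm{Err}$ is approximated by averaging that quantity over many independent training sets.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 30  # training-set size (arbitrary choice)

def draw_sample(n):
    # (X, Y) pairs from a made-up data-generating process: Y = sin(X) + noise
    x = rng.uniform(0, 3, n)
    y = np.sin(x) + rng.normal(0, 0.3, n)
    return x, y

def fit(x, y, degree=3):
    # \hat{f}: a cubic polynomial fit, standing in for any learning procedure
    return np.polyfit(x, y, degree)

def test_error(coefs, n_test=100_000):
    # Monte Carlo approximation of E[L(Y, \hat{f}(X)) | T], squared-error loss
    x, y = draw_sample(n_test)
    return np.mean((y - np.polyval(coefs, x)) ** 2)

# Err_T: test error conditional on ONE specific training set T
x_tr, y_tr = draw_sample(N)
err_T = test_error(fit(x_tr, y_tr))

# Err: Err_T averaged over many independent draws of the training set
err = np.mean([test_error(fit(*draw_sample(N))) for _ in range(200)])

print(f"Err_T (this training set): {err_T:.4f}")
print(f"Err   (averaged over T):   {err:.4f}")
```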
Also, I've read this answer here, which (on my possibly incorrect reading) agrees that $\mathrm{Err}$ is the quantity of interest, but suggests that we often talk about $\mathrm{Err}_{\mathcal{T}}$ because it can be estimated by cross-validation. That seems to contradict Section 7.12 of the textbook, which (again, on my possibly incorrect reading) suggests that cross-validation provides a better estimate of $\mathrm{Err}$ than of $\mathrm{Err}_{\mathcal{T}}$.
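To pin down what the CV estimator actually computes, here is a continuation of the sketch above (again my own construction, reusing `draw_sample`, `fit`, `test_error`, and `err` from that block). The K-fold CV number is built from a single training set $\mathcal{T}$, so the question, as I understand Section 7.12, is whether across draws of $\mathcal{T}$ this number tracks that set's own $\mathrm{Err}_{\mathcal{T}}$ or the constant $\mathrm{Err}$.

```python
def cv_error(x, y, k=10):
    # K-fold cross-validation estimate computed from a single training set
    idx = rng.permutation(len(x))
    errs = []
    for fold in np.array_split(idx, k):
        train = np.setdiff1d(idx, fold)          # hold out one fold, fit on the rest
        coefs = fit(x[train], y[train])
        errs.append(np.mean((y[fold] - np.polyval(coefs, x[fold])) ** 2))
    return np.mean(errs)

# One training set yields one CV number and one Err_T; Err is a fixed constant.
x_tr, y_tr = draw_sample(N)
print(f"CV estimate: {cv_error(x_tr, y_tr):.4f}")
print(f"Err_T:       {test_error(fit(x_tr, y_tr)):.4f}")
print(f"Err:         {err:.4f}")
```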
I'm going around in circles on this one, so I thought I would ask here.