I'm reading about test/generalization error in Hastie et al.'s Elements of Statistical Learning (2nd ed). In section 7.4, it is written that given a training set $\mathcal{T} = \{(x_1, y_1), (x_2, y_2), \ldots, (x_N, y_N)\}$ the expected generalization error of a model $\hat{f}$ is $$Err = E_{\mathcal{T}}[E_{X^0, Y^0}[L(Y^0, \hat{f}(X^0))|\mathcal{T}]],$$
where $(X^0, Y^0)$ is a new test point, drawn from $F$, the joint distribution of the data.
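Just to make the nesting explicit (this is only a restatement of the display above, with $\hat f_{\mathcal{T}}$ denoting the fit produced by a particular training set and $P$ denoting the distribution of $\mathcal{T}$ — notation that is mine, not ESL's): $$Err = \int \left[\int L\big(y^0, \hat f_{\mathcal{T}}(x^0)\big)\, dF(x^0, y^0)\right] dP(\mathcal{T}),$$ i.e. the inner expectation averages the loss over a fresh test point with the fit held fixed, and the outer expectation then averages that quantity over repeated draws of the training set.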
Suppose my model is fit by ordinary least squares (OLS), i.e. $\hat{\beta} = (X^TX)^{-1}X^TY$, where $X$ is the training design matrix (assumed to have full column rank) and $Y$ the vector of training responses, so the prediction at a new point is $\hat{f}(X^0) = X^0\hat{\beta}$. My question is: what does it mean to (1) take the expectation over $X^0, Y^0$, and (2) take the expectation over the training set $\mathcal{T}$?
For example, suppose $Y = X\beta + \epsilon$, where $E[\epsilon]=0, Var(\epsilon) = \sigma^2I.$
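To make the setup concrete, here is a minimal NumPy sketch of this model; the dimensions, $\beta$, and $\sigma$ are made up purely for illustration, and I reuse the same objects in the checks below.

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up dimensions and parameters, purely for illustration.
N, p = 50, 3                       # training sample size, number of covariates
beta = np.array([1.0, -2.0, 0.5])  # true coefficients
sigma = 1.0                        # noise standard deviation

# Training design matrix (full column rank with probability 1).
X = rng.normal(size=(N, p))

def fit_ols(X, Y):
    """OLS estimate: beta_hat = (X^T X)^{-1} X^T Y."""
    return np.linalg.solve(X.T @ X, X.T @ Y)

# One realization of the training responses, and the corresponding fit.
Y = X @ beta + sigma * rng.normal(size=N)
beta_hat = fit_ols(X, Y)
```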
(1) Consider evaluating $E_{X^0, Y^0}[X^0\hat{\beta}|\mathcal{T}]$. Is the following correct?
\begin{align*} E_{X^0, Y^0}[X^0\hat{\beta}|\mathcal{T}] &= E_{X^0, Y^0}[X^0(X^TX)^{-1}X^TY|\mathcal{T}]\\ &= E_{X^0, Y^0}[X^0|\mathcal{T}](X^TX)^{-1}X^TY\\ &= E_{X^0, Y^0}[X^0](X^TX)^{-1}X^TY \end{align*}
The second equality holds because $(X^TX)^{-1}X^TY$ is a function of $\mathcal{T}$ alone, hence constant given $\mathcal{T}$; the last equality holds if $X^0$ is independent of the training set $\mathcal{T}$.
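To see what "averaging over $X^0, Y^0$ with $\mathcal{T}$ fixed" means operationally, here is a continuation of the sketch above, with $X^0 \sim N(0, I_p)$ drawn independently of the training set (so $E[X^0] = 0$):

```python
# Condition on the training set by holding beta_hat fixed, and average the
# prediction over many fresh, independent test draws X^0.
M = 100_000
X0 = rng.normal(size=(M, p))

mc_mean = (X0 @ beta_hat).mean()  # Monte Carlo estimate of E_{X^0}[X^0 beta_hat | T]
exact = np.zeros(p) @ beta_hat    # E[X^0] beta_hat, which is 0 here since E[X^0] = 0
print(mc_mean, exact)             # should agree up to Monte Carlo error
```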
(2) Consider evaluating $E_{\mathcal{T}}[X^0\hat{\beta}|X^0]$. Is the following correct? \begin{align*} E_{\mathcal{T}}[X^0\hat{\beta}|X^0] &= X^0 E_{\mathcal{T}}[(X^TX)^{-1}X^TY|X^0]\\ &= X^0 (X^TX)^{-1}X^TE_{\mathcal{T}}[Y|X^0]\\ &= X^0 (X^TX)^{-1}X^TX\beta\\ &= X^0\beta \end{align*}
The second equality assumes the covariates $X$ are fixed by design, so the only thing that is random with respect to the training set $\mathcal{T}$ is $Y$ (through $\epsilon$); the third uses $E_{\mathcal{T}}[Y|X^0] = E[Y] = X\beta$, and the last line follows since $(X^TX)^{-1}X^TX = I$. Is that correct?
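Again continuing the sketch, a fixed-design check of (2): keep $X$ and a single test point $X^0$ fixed, redraw only the noise, refit, and average the prediction over simulated training sets; the average should approach $X^0\beta$.

```python
# Redraw only the noise (X held fixed), refit, and average x0 @ beta_hat.
x0 = rng.normal(size=p)            # one fixed test covariate vector
R = 10_000
preds = np.empty(R)
for r in range(R):
    Y_r = X @ beta + sigma * rng.normal(size=N)
    preds[r] = x0 @ fit_ols(X, Y_r)

print(preds.mean(), x0 @ beta)     # E_T[x0 beta_hat | x0] = x0 beta (unbiasedness)
```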