In Hastie et al.'s book The Elements of Statistical Learning, two subsections cover in-sample prediction error and optimism bias (chapter 7, pp. 228-230).
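Hastie et al. explain that, defining the in-sample prediction error as
$$\mathrm{Err}_{in} = \frac{1}{N}\sum_{i=1}^{N} \mathrm{E}_{Y^0}\left[L\left(Y_i^0, \hat{f}(x_i)\right) \mid \mathcal{T}\right] \qquad (1),$$
we can take the expected value of this quantity w.r.t. $Y$, given $X$, and get
$$\mathrm{E}_y(\mathrm{Err}_{in}) = \mathrm{E}_y(\bar{err}) + \frac{2}{N}\sum_{i=1}^{N} \mathrm{Cov}(\hat{y}_i, y_i) \qquad (2)$$
with $\bar{err}$ the empirical loss. The empirical counterpart would be
$$\widehat{\mathrm{Err}}_{in} = \bar{err} + \frac{2}{N}\sum_{i=1}^{N} \widehat{\mathrm{Cov}}(\hat{y}_i, y_i),$$
which gives us a framework to derive and compare model selection criteria, especially Mallows' $C_p$.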
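To make the covariance identity concrete, here is a minimal simulation sketch. It is not taken from the book. It checks numerically that the expected in-sample error minus the expected training error equals $\frac{2}{N}\sum_i \mathrm{Cov}(\hat{y}_i, y_i)$. Everything specific in it is an assumption chosen for illustration: squared-error loss, a fixed design, a hypothetical true mean $f(x) = 1 + 2x$, Gaussian noise, and ordinary least squares with $d = 2$ parameters. Under those assumptions the theoretical value of both sides is $2 d \sigma^2 / N$, which is the penalty that appears in $C_p$.

```python
import random


def simulate_optimism(n=20, sigma=1.0, reps=20000, seed=0):
    """Monte Carlo check of E[Err_in] - E[err_bar] = (2/N) * sum_i Cov(yhat_i, y_i).

    Illustrative assumptions (not from the book): fixed design x, true mean
    f(x) = 1 + 2x, Gaussian noise, squared-error loss, OLS with d = 2 parameters.
    """
    rng = random.Random(seed)
    x = [i / (n - 1) for i in range(n)]      # fixed design points, held constant
    f = [1.0 + 2.0 * xi for xi in x]         # hypothetical true regression function
    x_bar = sum(x) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)

    err_bar_sum = 0.0                        # accumulates training error
    err_in_sum = 0.0                         # accumulates in-sample error
    s_y = [0.0] * n                          # running sums for Cov(yhat_i, y_i)
    s_yhat = [0.0] * n
    s_prod = [0.0] * n

    for _ in range(reps):
        # Draw a new response vector at the same x (expectation over Y given X).
        y = [fi + rng.gauss(0.0, sigma) for fi in f]

        # Closed-form simple linear regression (intercept + slope).
        y_bar = sum(y) / n
        slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
        intercept = y_bar - slope * x_bar
        yhat = [intercept + slope * xi for xi in x]

        # Training error: average squared residual on the observed y.
        err_bar_sum += sum((yi - yh) ** 2 for yi, yh in zip(y, yhat)) / n

        # In-sample error: expected squared error for a fresh Y0_i at each x_i,
        # which for squared loss is sigma^2 + (f(x_i) - yhat_i)^2.
        err_in_sum += sum(sigma ** 2 + (fi - yh) ** 2 for fi, yh in zip(f, yhat)) / n

        for i in range(n):
            s_y[i] += y[i]
            s_yhat[i] += yhat[i]
            s_prod[i] += y[i] * yhat[i]

    # Sum over i of the Monte Carlo estimate of Cov(yhat_i, y_i).
    cov_sum = sum(
        s_prod[i] / reps - (s_y[i] / reps) * (s_yhat[i] / reps) for i in range(n)
    )
    return {
        "optimism": (err_in_sum - err_bar_sum) / reps,   # E[Err_in] - E[err_bar]
        "cov_term": 2.0 * cov_sum / n,                   # (2/N) * sum_i Cov
        "theory": 2.0 * 2 * sigma ** 2 / n,              # 2 * d * sigma^2 / N, d = 2
    }


if __name__ == "__main__":
    print(simulate_optimism())
```

The three returned values should agree up to Monte Carlo noise. The fit here is linear in $y$, so the covariance sum reduces to $d\sigma^2$; for a nonlinear or adaptive fitting procedure the `theory` entry would no longer apply, while the comparison between `optimism` and `cov_term` would still be the relevant check under squared-error loss.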
I am looking for a book or an article that follows the same logic. In particular, I am interested in alternative presentations in which the covariance term appears.
Two underlying motives for my question are 1/ to make sure I understand Hastie et al.'s point correctly and get as much detail on it as I can, and 2/ to check whether their logic is commonplace or whether this framework (esp. formula 2) is theirs alone.