In Section 7.11 of *The Elements of Statistical Learning*, an alternative expression for estimating $\gamma$, the no-information error rate, is given:
$$\hat{\gamma} = \frac{1}{N^2}\sum_{i=1}^N\sum_{i'=1}^NL(y_i,\hat{f}(x_{i'})),$$
where $L(y,\hat{f}(x))$ is an arbitrary cost function.
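As a minimal sketch of this estimator (assuming the predictions $\hat{f}(x_{i'})$ have been computed up front and `loss` is any callable taking a single $(y, \hat{f}(x))$ pair; the function and argument names are my own), the double sum is just a plain double loop:

```python
def gamma_hat(y, preds, loss):
    """Estimate gamma by averaging the loss over all (y_i, f_hat(x_i')) pairs."""
    n = len(y)
    total = 0.0
    for i in range(n):            # every ground-truth observation y_i ...
        for i_prime in range(n):  # ... scored against every prediction f_hat(x_i')
            total += loss(y[i], preds[i_prime])
    return total / n ** 2
```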
For example, consider multiclass classification with $C$ classes, where $y$ is a one-hot vector encoding the ground-truth class membership and $\hat{p} = \hat{f}(x)$ is the corresponding vector of predicted class probabilities. For cross-entropy loss
$$L(y,\hat{f}(x)) = -\sum_{c=1}^C y_c \ln(\hat{p}_c) $$
and similarly for a squared-error loss on the predicted probability of the true class
$$L(y,\hat{f}(x)) = \sum_{c=1}^C y_c(1-\hat{p}_c)^2$$.
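Both losses can be written down directly from these formulas; a sketch, with `y` a one-hot NumPy array and `p` the corresponding vector of predicted probabilities (again, the function names are my own):

```python
import numpy as np

def cross_entropy(y, p):
    # y: one-hot ground-truth vector, p: predicted class probabilities
    return -np.sum(y * np.log(p))

def squared_error(y, p):
    # squared error on the predicted probability of the true class, as defined above
    return np.sum(y * (1.0 - p) ** 2)
```

Either function can be passed as the `loss` argument of the `gamma_hat` sketch above.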
Substituting these losses into the estimator above gives
$$\hat{\gamma} = -\frac{1}{N^2}\sum_{i=1}^N\sum_{i'=1}^N\sum_{c=1}^C y_{ic}\ln(\hat{p}_{i'c})$$
and
$$\hat{\gamma} = \frac{1}{N^2}\sum_{i=1}^N\sum_{i'=1}^N\sum_{c=1}^C y_{ic}(1-\hat{p}_{i'c})^2$$
for cross-entropy and squared-error loss respectively.
That is, the loss is evaluated for the prediction at every feature vector $x_{i'}$ in the dataset against every ground-truth observation $y_i$ in the dataset, and then averaged over all $N^2$ pairs.
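In vectorized form (a sketch assuming `Y` is the $N \times C$ one-hot label matrix and `P` is the $N \times C$ matrix of predicted probabilities), the entry $(i, i')$ of `Y @ np.log(P).T` is exactly $\sum_c y_{ic}\ln(\hat{p}_{i'c})$, so the two double sums reduce to:

```python
import numpy as np

def gamma_hat_cross_entropy(Y, P):
    # (Y @ log(P).T)[i, i'] = sum_c y_ic * ln(p_i'c); average over all N^2 pairs
    n = Y.shape[0]
    return -np.sum(Y @ np.log(P).T) / n ** 2

def gamma_hat_squared_error(Y, P):
    # (Y @ ((1 - P)**2).T)[i, i'] = sum_c y_ic * (1 - p_i'c)^2
    n = Y.shape[0]
    return np.sum(Y @ ((1.0 - P) ** 2).T) / n ** 2
```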
The double sum is also equivalent to weighting the per-class loss of each prediction by the empirical frequency of that class in the dataset, e.g. for cross-entropy loss
$$\hat{\gamma} = -\frac{1}{N}\sum_{c=1}^C\left(\frac{1}{N}\sum_{i=1}^N y_{ic}\right)\sum_{i'=1}^N \ln(\hat{p}_{i'c})$$
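The weighted form can be checked numerically against the double-sum version above; a sketch for the cross-entropy case, reusing the same `Y`/`P` convention:

```python
import numpy as np

def gamma_hat_cross_entropy_weighted(Y, P):
    n = Y.shape[0]
    class_freq = Y.mean(axis=0)  # (1/N) * sum_i y_ic, one weight per class
    # weight the summed log-probabilities of each class by its empirical frequency
    return -np.sum(class_freq * np.log(P).sum(axis=0)) / n
```

For any valid `Y` and `P`, `np.allclose(gamma_hat_cross_entropy(Y, P), gamma_hat_cross_entropy_weighted(Y, P))` should hold.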
Furthermore, I have found an example in my research where the standard optimism bootstrap underestimates optimism for heavily overfit classifiers, even when using a proper scoring rule (log-likelihood loss). See here.