When calculating the $r^2$ of some model on some test set, we're effectively comparing the MSE of that model's predictions to the MSE of a naive baseline model that always predicts the sample mean of the target variable in the test set.
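To make this concrete, here is a minimal numpy sketch (the toy data and variable names are mine):

```python
import numpy as np

rng = np.random.default_rng(0)
y_test = rng.normal(loc=2.0, scale=1.0, size=100)    # test-set targets
y_pred = y_test + rng.normal(scale=0.5, size=100)    # some model's predictions

mse_model = np.mean((y_test - y_pred) ** 2)
mse_naive = np.mean((y_test - y_test.mean()) ** 2)   # baseline "peeks" at the test mean

r2 = 1 - mse_model / mse_naive
print(r2)  # equals sklearn.metrics.r2_score(y_test, y_pred)
```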
But that naive model does something the actual model can't do: it "peeks" into the test data (to see the sample mean of its target variable).
Since the test data's sample mean may differ significantly from the training data's sample mean, it seems natural to define another statistic in which the naive model uses the training data's sample mean instead of the test data's.
Effectively, this definition just replaces $r^2 = 1- {MSE \over {\sigma^2_{y_{test}}}} $ with $r^2 = 1- {MSE \over {\sigma^2_{y_{test}}+({\bar y_{test}} - {\bar y_{train}})^2}} $, giving slightly higher values (derivation below).
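In code, the proposed statistic would look like this (a sketch; the function names are mine, and $\sigma^2_{y_{test}}$ is taken as the divide-by-$N$ variance, matching `np.var`):

```python
import numpy as np

def r2_out_of_sample(y_test, y_pred, y_train_mean):
    """1 - MSE(model) / MSE(baseline that always predicts the *train* mean)."""
    mse_model = np.mean((y_test - y_pred) ** 2)
    mse_naive = np.mean((y_test - y_train_mean) ** 2)  # no peeking at y_test's mean
    return 1 - mse_model / mse_naive

def r2_out_of_sample_alt(y_test, y_pred, y_train_mean):
    """Same value via the variance-plus-mean-shift form of the denominator."""
    mse_model = np.mean((y_test - y_pred) ** 2)
    shift_sq = (y_test.mean() - y_train_mean) ** 2
    return 1 - mse_model / (np.var(y_test) + shift_sq)
```

The two functions agree exactly, which is the content of the derivation below.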
Is there a name for this statistic? Is the main point behind it ("don't let your 'baseline' peek into the test data") covered by some other statistic?
The derivation is as follows: $$ r^2_{\text{out-of-sample}} = 1- \dfrac{MSE}{{1 \over N_{test}} \sum_{i \in \text{Test Group}} (y_i - \bar{y}_{train})^2} $$ where $MSE = {1 \over N_{test}} \sum_{i\in\text{Test Group}}(y_i-\hat{y}_i)^2$ (as in @Dave's answer below).
The expression in the denominator equals ${1 \over N_{test}} \sum_{i \in \text{Test Group}} (y_i-\bar{y}_{test}+\bar{y}_{test}-\bar{y}_{train})^2$, which expands to ${1 \over N_{test}} \sum_{i \in \text{Test Group}} \left((y_i-\bar{y}_{test})^2+(\bar{y}_{test}-\bar{y}_{train})^2+2(y_i-\bar{y}_{test})(\bar{y}_{test}-\bar{y}_{train})\right)$. Its first term is $\sigma^2_{y_{test}}$ and its last term vanishes, since $\sum_{i \in \text{Test Group}} (y_i-\bar{y}_{test}) = 0$. We also note that, writing $\Delta = \bar{y}_{test}-\bar{y}_{train}$, $$ r^2_{\text{out-of-sample}} = \dfrac{r^2 \, \sigma^2_{y_{test}} + \Delta^2}{\sigma^2_{y_{test}} + \Delta^2}, $$ a weighted average of $r^2$ and $1$, which is why it gives slightly higher values than $r^2$ (for standardized targets with $\sigma^2_{y_{test}} = 1$, this reduces to ${r^2+\Delta^2 \over 1+\Delta^2}$).
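A quick numeric sanity check of this identity (toy data and variable names are mine):

```python
import numpy as np

rng = np.random.default_rng(1)
y_train = rng.normal(loc=0.0, size=200)
y_test = rng.normal(loc=0.5, size=100)            # mean deliberately shifted from train
y_pred = y_test + rng.normal(scale=0.7, size=100)

mse = np.mean((y_test - y_pred) ** 2)
var = np.var(y_test)                              # divide-by-N variance
delta_sq = (y_test.mean() - y_train.mean()) ** 2

r2 = 1 - mse / var
direct = 1 - mse / np.mean((y_test - y_train.mean()) ** 2)
via_identity = (r2 * var + delta_sq) / (var + delta_sq)

print(np.isclose(direct, via_identity))  # True
```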