Is this a typo in Stone's (1977) paper on asymptotic equivalence between AIC and LOOCV?

Question

I am unsure about an expression in Stone's (1977) paper on asymptotic equivalence between AIC and LOOCV. In Section 4, the third line from the bottom of page 45 starts with $L(\theta)-1(y_i|x_i,\theta)$. The second part of this expression is puzzling to me.

What does $1$ stand for? An indicator function?
Or should it actually be $l$ rather than $1$, meaning the likelihood of a single observation $l(y_i|x_i,\theta)$?

References

Stone, M. (1977). An asymptotic equivalence of choice of model by cross‐validation and Akaike's criterion. Journal of the Royal Statistical Society: Series B (Methodological), 39(1), 44-47.

Related question: [Equivalence of AIC and LOOCV under mismatched loss functions](https://stats.stackexchange.com/questions/406430/). — Richard Hardy, May 08 '19 at 14:39
Related question: [Example and counterexample for Stone's (1977) assumption](https://stats.stackexchange.com/questions/407291/). — Richard Hardy, May 08 '19 at 15:00
With no expertise, it seems clear that it is the letter l, since it is the log-likelihood with the ith data point removed. — seanv507, May 08 '19 at 15:19
@seanv507, that is exactly what I thought. Good to get some confirmation. — Richard Hardy, May 08 '19 at 15:26

score 1 · Accepted Answer · answered May 10 '21 at 09:34

This is definitely a typo. Note that $\ell$ stands for the log-likelihood, $S=\{(x_i,y_i)\}$ for the training data, $S_{-i}$ for the training data with the $i$-th entry removed (defined right before equation (3.3)), and $$ L(\theta) = \sum_j\ell(y_j|x_j,\theta).$$

Finally, $\hat\theta(S)$ ("$\hat\theta$ for short") is defined as the maximizer of $L(\theta)$.

We are considering "$\hat\theta(S_{-i})$ ($\hat\theta_{-i}$ for short)". Per the definition of $\hat\theta(S)$, this is the maximizer of the log-likelihood computed from $S_{-i}$ instead of $S$, that is, the maximizer of

$$ \sum_{j\neq i}\ell(y_j|x_j,\theta) = L(\theta)-\ell(y_i|x_i,\theta).$$

And Stone (1977) writes $L(\theta)-1(y_i|x_i,\theta)$ instead of the last expression. So the $1$ should be an $\ell$ here.
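The identity can also be checked numerically. The following is a minimal sketch, not anything taken from Stone's paper: it assumes a toy model $y \sim N(\theta x, 1)$ purely for illustration, and the function names (`loglik_one`, `loglik_full`, `loglik_leave_one_out`) are hypothetical.

```python
import math
import random


def loglik_one(y, x, theta):
    """Single-observation log-likelihood l(y | x, theta) for the assumed model y ~ N(theta * x, 1)."""
    return -0.5 * math.log(2 * math.pi) - 0.5 * (y - theta * x) ** 2


def loglik_full(data, theta):
    """L(theta): sum of the single-observation terms over all of S."""
    return sum(loglik_one(y, x, theta) for x, y in data)


def loglik_leave_one_out(data, i, theta):
    """Log-likelihood based on S_{-i}: the sum over all j != i."""
    return sum(loglik_one(y, x, theta) for j, (x, y) in enumerate(data) if j != i)


# Simulated data (illustrative assumption: true slope 2, unit noise variance).
random.seed(0)
xs = [random.uniform(-1.0, 1.0) for _ in range(20)]
data = [(x, 2.0 * x + random.gauss(0.0, 1.0)) for x in xs]

theta = 1.5  # any fixed parameter value works; the identity holds for every theta
i = 3        # index of the observation that is left out

lhs = loglik_leave_one_out(data, i, theta)
rhs = loglik_full(data, theta) - loglik_one(data[i][1], data[i][0], theta)

# The two sides agree up to floating-point rounding.
assert math.isclose(lhs, rhs, rel_tol=1e-9, abs_tol=1e-9)
```

Reading the $1$ as an indicator function would not give a meaningful counterpart to the right-hand side here, whereas subtracting the single-observation term $\ell(y_i|x_i,\theta)$ reproduces the leave-one-out sum exactly.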

(Another argument for using $\ell$ instead of $l$. Some things do get better.)
