
I am currently reading Jun Shao's Mathematical Statistics, and in his discussion of U statistics, he proves that

$Var(U_n) = \binom{n}{m}^{-1} \sum_{k=1}^m \binom{m}{k}\binom{n-m}{m-k}\zeta_k$

where $\zeta_k$ is the variance of the conditional expectation of the kernel given $X_1,\ldots,X_k$, $n$ is the sample size, and $m$ is the number of arguments of the kernel. He then states the following three facts as a corollary:

(i) $\frac{m^2}{n}\zeta_1 \leq Var(U_n) \leq \frac{m}{n}\zeta_m$

(ii) $(n+1)Var(U_{n+1}) \leq nVar(U_n)$

(iii) For any fixed m and $k = 1,\ldots,m$, if $\zeta_j = 0$ for $j < k$ and $\zeta_k > 0$, then

$Var(U_n) = \dfrac{k! {m\choose k}^2\zeta_k}{n^k} + O\left(\frac{1}{n^{k+1}}\right)$

How can we go from the statement of Hoeffding's theorem to these? In particular, I am having trouble working through the simplification of the binomial coefficients.
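For a concrete check (my own example, not from Shao), the variance formula can be verified by simulation for the kernel $h(x_1, x_2) = x_1 x_2$ with $X \sim N(1,1)$, for which $\zeta_1 = Var(X) = 1$ and $\zeta_2 = Var(X_1 X_2) = 3$. A Python sketch:

```python
import math
import numpy as np

rng = np.random.default_rng(0)

n, m = 10, 2
zeta = {1: 1.0, 2: 3.0}  # zeta_1 = Var(X), zeta_2 = Var(X1*X2) for X ~ N(1, 1)

# Hoeffding's formula for Var(U_n)
theory = sum(math.comb(m, k) * math.comb(n - m, m - k) * zeta[k]
             for k in range(1, m + 1)) / math.comb(n, m)

# Monte Carlo: U_n averages x_i * x_j over all pairs i < j,
# computed via (S^2 - Q) / 2 with S = sum(x), Q = sum(x^2)
reps = 200_000
x = rng.normal(1.0, 1.0, size=(reps, n))
u = ((x.sum(axis=1) ** 2 - (x ** 2).sum(axis=1)) / 2) / math.comb(n, 2)
mc = u.var()

print(theory, mc)  # close agreement expected
```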

stats_model

1 Answer


Using the variance decomposition $Var(X) = Var(E(X|Y)) + E(Var(X|Y))$, you can show that $\zeta_1 \le \cdots \le \zeta_m$: by the tower property, $E(h|X_1,\ldots,X_k) = E\big(E(h|X_1,\ldots,X_{k+1})\,\big|\,X_1,\ldots,X_k\big)$, so applying the decomposition to $E(h|X_1,\ldots,X_{k+1})$ gives $\zeta_{k+1} \ge \zeta_k$.

Note that $\binom{n}{m}^{-1}\binom{m}{k}\binom{n-m}{m-k} = k!\binom{m}{k}^2\,\dfrac{(n-m)(n-m-1)\cdots(n-2m+k+1)}{n(n-1)\cdots(n-m+1)}$ for $k=1,\ldots,m-1$. The denominator has $m$ factors and the numerator has $m-k$ factors, so the ratio is of order $n^{-k}$: for $k=1$ it is $1/n+O(1/n^2)$, for $k=2$ it is $1/n^2+O(1/n^3)$, and so forth. This suffices to prove (iii).
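A numeric sanity check of this binomial identity and of the $n^{-k}$ rate (the choices $m = 4$ and the grid of $n$ values are arbitrary):

```python
from math import comb, factorial

def lhs(n, m, k):
    """(n choose m)^{-1} (m choose k) (n-m choose m-k)."""
    return comb(m, k) * comb(n - m, m - k) / comb(n, m)

def ratio(n, m, k):
    """(n-m)(n-m-1)...(n-2m+k+1) over n(n-1)...(n-m+1): m-k factors over m."""
    num, den = 1, 1
    for j in range(n - 2 * m + k + 1, n - m + 1):
        num *= j
    for j in range(n - m + 1, n + 1):
        den *= j
    return num / den

def rhs(n, m, k):
    return factorial(k) * comb(m, k) ** 2 * ratio(n, m, k)

# ratio(n, m, k) * n^k tends to 1, i.e. the k-th term is of order n^{-k}
for n in (20, 100, 1000):
    print([round(ratio(n, 4, k) * n ** k, 3) for k in range(1, 4)])
```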

Part (ii) is harder; I think induction on $n$ might be a good place to start.

The second inequality of part (i) uses the fact that $nVar(U_n)$ is a decreasing sequence (part (ii)): since $U_m$ is just the kernel itself, $Var(U_m) = \zeta_m$, so $nVar(U_n) \le mVar(U_m) = m\zeta_m$ for all $n \ge m$. The first inequality follows because $nVar(U_n)$ converges to $m^2\zeta_1$ (part (iii) with $k=1$) and, being decreasing, must stay above its limit, so $nVar(U_n) \ge m^2\zeta_1$.
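Parts (i) and (ii) can likewise be checked numerically from Hoeffding's formula; the sketch below uses the kernel $h(x_1, x_2) = x_1 x_2$ with $X \sim N(1,1)$, for which $\zeta_1 = 1$ and $\zeta_2 = 3$ (my choice of example, not from the original):

```python
from math import comb

m = 2
zeta = [1.0, 3.0]  # zeta_1, zeta_2 for h(x1, x2) = x1*x2, X ~ N(1, 1)

def var_u(n):
    """Var(U_n) from Hoeffding's formula."""
    return sum(comb(m, k) * comb(n - m, m - k) * zeta[k - 1]
               for k in range(1, m + 1)) / comb(n, m)

ns = range(m, 200)
seq = [n * var_u(n) for n in ns]

decreasing = all(a >= b for a, b in zip(seq, seq[1:]))           # part (ii)
bounds = all(m**2 * zeta[0] / n <= var_u(n) <= m * zeta[-1] / n  # part (i)
             for n in ns)
print(decreasing, bounds, seq[-1])  # seq tends to m^2 * zeta_1 = 4
```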

Statisfun