Let $X_{1},\ldots,X_{n}$ be distinct observations (no ties). Let $X_{1}^{*},\ldots,X_{n}^{*}$ denote a bootstrap sample (a sample from the empirical CDF) and let $\bar{X}_{n}^{*}=\frac{1}{n}\sum_{i=1}^{n}X_{i}^{*}$. Find $E(\bar{X}_{n}^{*})$ and $\mathrm{Var}(\bar{X}_{n}^{*})$.

What I have so far is that $X_{i}^{*}$ takes each of the values $X_{1},\ldots,X_{n}$ with probability $\frac{1}{n}$, so $$ E(X_{i}^{*})=\frac{1}{n}E(X_{1})+\cdots+\frac{1}{n}E(X_{n})=\frac{n\mu}{n}=\mu $$ and $$E(X_{i}^{*2})=\frac{1}{n}E(X_{1}^{2})+\cdots+\frac{1}{n}E(X_{n}^{2})=\frac{n(\mu^{2}+\sigma^{2})}{n}=\mu^{2}+\sigma^{2}\>, $$ which gives $$ \mathrm{Var}(X_{i}^{*})=E(X_{i}^{*2})-(E(X_{i}^{*}))^{2}=\mu^{2}+\sigma^{2}-\mu^{2}=\sigma^{2} \>. $$

Then, $$E(\bar{X}_{n}^{*})=E\left(\frac{1}{n}\sum_{i=1}^{n}X_{i}^{*}\right)=\frac{1}{n}\sum_{i=1}^{n}E(X_{i}^{*})=\frac{n\mu}{n}=\mu $$ and $$ \mathrm{Var}(\bar{X}_{n}^{*})=\mathrm{Var}\left(\frac{1}{n}\sum_{i=1}^{n}X_{i}^{*}\right)=\frac{1}{n^{2}}\sum_{i=1}^{n}\mathrm{Var}(X_{i}^{*})$$ since the $X_{i}^{*}$'s are independent. This gives $\mathrm{Var}(\bar{X}_{n}^{*})=\frac{n\sigma^{2}}{n^{2}}=\frac{\sigma^{2}}{n}$.

However, I don't get the same answer when I condition on $X_{1},\ldots,X_{n}$ and use the formula for conditional variance: $$ \mathrm{Var}(\bar{X}_{n}^{*})=E(\mathrm{Var}(\bar{X}_{n}^{*}|X_{1},...,X_{n}))+\mathrm{Var}(E(\bar{X}_{n}^{*}|X_{1},\ldots,X_{n})) \>. $$

$E(\bar{X}_{n}^{*}|X_{1},\ldots,X_{n})=\bar{X}_{n}$ and $\mathrm{Var}(\bar{X}_{n}^{*}|X_{1},\ldots,X_{n})=\frac{1}{n^{2}}(\sum X_{i}^{2}-n\bar{X}_{n}^{2})$ so plugging these into the formula above gives (after some algebra) $\mathrm{Var}(\bar{X}_{n}^{*})=\frac{(2n-1)\sigma^{2}}{n^{2}}$.
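
For reference, the algebra can be sketched as follows. It assumes the original sample is iid with mean $\mu$ and variance $\sigma^{2}$, so that $E(X_{i}^{2})=\mu^{2}+\sigma^{2}$ and $E(\bar{X}_{n}^{2})=\mu^{2}+\frac{\sigma^{2}}{n}$:

$$ E\left(\mathrm{Var}(\bar{X}_{n}^{*}|X_{1},\ldots,X_{n})\right)=\frac{1}{n^{2}}\left(n(\mu^{2}+\sigma^{2})-n\left(\mu^{2}+\frac{\sigma^{2}}{n}\right)\right)=\frac{(n-1)\sigma^{2}}{n^{2}} $$

$$ \mathrm{Var}\left(E(\bar{X}_{n}^{*}|X_{1},\ldots,X_{n})\right)=\mathrm{Var}(\bar{X}_{n})=\frac{\sigma^{2}}{n}=\frac{n\sigma^{2}}{n^{2}} $$

$$ \mathrm{Var}(\bar{X}_{n}^{*})=\frac{(n-1)\sigma^{2}}{n^{2}}+\frac{n\sigma^{2}}{n^{2}}=\frac{(2n-1)\sigma^{2}}{n^{2}} $$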

Am I doing something wrong here? My feeling is that I am not using the conditional variance formula correctly but I'm not sure. Any help would be appreciated.

cardinal
rrruss
  • Maybe your V(E(X|X1..Xn)) is not correctly calculated. The answer should be the same. –  Nov 27 '13 at 02:17
  • You're probably right--but this answer doesn't seem terribly informative. Perhaps you could point to which part is not correct? – whuber Nov 27 '13 at 02:21

3 Answers

The correct answer is $\frac{n-1}{n^2}S^2$. The solution is #4 here

mpiktas
Greg

This may be a late answer, but the error in your calculation is this: you assumed that, unconditionally, your bootstrap sample is iid. That is false. Conditional on your sample, the bootstrap sample is indeed iid, but unconditionally you lose independence (although the variables remain identically distributed). This is essentially Exercise 13 in Larry Wasserman's All of Nonparametric Statistics.
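
To make the loss of independence concrete, here is a sketch of the unconditional covariance between two distinct bootstrap draws. It assumes the original sample is iid with mean $\mu$ and variance $\sigma^{2}$; the shorthand $\mathbf{X}=(X_{1},\ldots,X_{n})$ is introduced here only for brevity. For $i\neq j$, the law of total covariance gives

$$ \mathrm{Cov}(X_{i}^{*},X_{j}^{*})=E\left(\mathrm{Cov}(X_{i}^{*},X_{j}^{*}|\mathbf{X})\right)+\mathrm{Cov}\left(E(X_{i}^{*}|\mathbf{X}),E(X_{j}^{*}|\mathbf{X})\right)=0+\mathrm{Var}(\bar{X}_{n})=\frac{\sigma^{2}}{n} \>, $$

which is not zero. Including these cross terms in the variance of the sum yields

$$ \mathrm{Var}(\bar{X}_{n}^{*})=\frac{1}{n^{2}}\left(\sum_{i=1}^{n}\mathrm{Var}(X_{i}^{*})+\sum_{i\neq j}\mathrm{Cov}(X_{i}^{*},X_{j}^{*})\right)=\frac{1}{n^{2}}\left(n\sigma^{2}+n(n-1)\frac{\sigma^{2}}{n}\right)=\frac{(2n-1)\sigma^{2}}{n^{2}} \>. $$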

M Turgeon

For anyone in the future finding this question: the 2nd variance value computed from the conditional variance formula (at the bottom of the question) is correct. The first value is incorrect.

The answer above that says "The correct answer is" shows the value of the conditional variance $\mathrm{Var}(\bar{X}_{n}^{*}|X_{1},\ldots,X_{n})=\frac{n-1}{n^{2}}S^{2}$. The unconditional variance is $\mathrm{Var}(\bar{X}_{n}^{*})=\frac{(2n-1)\sigma^{2}}{n^{2}}=\frac{\sigma^{2}}{n}\left(2-\frac{1}{n}\right)$. This can be read directly from the linked source PDF, but it wasn't copied correctly to this page.

Indeed, the mistake is that the $X_{i}^{*}$ are not independent (only conditionally independent given $X_{1},\ldots,X_{n}$), so the first computation of $\mathrm{Var}(\bar{X}_{n}^{*})$ is incorrect: $\mathrm{Var}(\bar{X}_{n}^{*})\neq\frac{1}{n^{2}}\sum_{i=1}^{n}\mathrm{Var}(X_{i}^{*})$.
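
A quick Monte Carlo check can make the difference between the two values visible. The sketch below is only an illustration, not part of the linked solution: it assumes normally distributed data with mean 0, and the function name `simulate_bootstrap_mean_variance` and the choices `n = 5`, `sigma = 2.0` are made up for the example. Each replication draws a fresh original sample and then one bootstrap sample from it, so the variance across replications estimates the unconditional $\mathrm{Var}(\bar{X}_{n}^{*})$.

```python
import numpy as np


def simulate_bootstrap_mean_variance(n=5, sigma=2.0, reps=200_000, seed=0):
    """Estimate the unconditional variance of the bootstrap sample mean.

    Assumption for illustration only: the original data are N(0, sigma^2).
    """
    rng = np.random.default_rng(seed)
    # One fresh original sample per replication: shape (reps, n).
    x = rng.normal(loc=0.0, scale=sigma, size=(reps, n))
    # One bootstrap sample per replication: indices drawn with replacement.
    idx = rng.integers(0, n, size=(reps, n))
    x_star = np.take_along_axis(x, idx, axis=1)
    # Bootstrap mean for each replication, then its variance across replications.
    boot_means = x_star.mean(axis=1)
    return boot_means.var()


n, sigma = 5, 2.0
estimate = simulate_bootstrap_mean_variance(n=n, sigma=sigma)
naive = sigma**2 / n                    # sigma^2 / n (treats draws as independent)
total = (2 * n - 1) * sigma**2 / n**2   # (2n - 1) sigma^2 / n^2

print(f"simulated: {estimate:.4f}")
print(f"sigma^2/n: {naive:.4f}")
print(f"(2n-1)sigma^2/n^2: {total:.4f}")
```

With these settings the simulated value should land close to $\frac{(2n-1)\sigma^{2}}{n^{2}}$ and clearly away from $\frac{\sigma^{2}}{n}$. Resampling from a single fixed original sample instead would estimate the conditional variance $\frac{n-1}{n^{2}}S^{2}$ for that sample.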

Alvin