
Consider a random sample $\{X_1,X_2,X_3\}$, where the $X_i$ are i.i.d. $\text{Bernoulli}(p)$ random variables with $p\in(0,1)$. Check whether $T(X)=X_1+2X_2+X_3$ is a sufficient statistic for $p$.

First, how can we find the distribution of $X_1+2X_2+X_3$? Or should it be broken down as $X_1+X_2+X_2+X_3$, and would this then follow $Bin(4,p)$? I think not, because the four summands are not independent: $X_2$ appears twice.

Alternatively, if I apply the factorization criterion to the joint p.m.f. of $(X_1,X_2,X_3)$, then $f(x_1,x_2,x_3)=p^{x_1+x_2+x_3}(1-p)^{3-(x_1+x_2+x_3)}=[p^{t(x)}(1-p)^{3-t(x)}]p^{-x_2}(1-p)^{x_2}$ where $t(x)=x_1+2x_2+x_3$.

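The leftover factor $p^{-x_2}(1-p)^{x_2}$ depends on $p$ and on $x_2$, and $x_2$ is not a function of $t(x)$ alone, so this suggests that $T$ is not sufficient.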
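A slightly more careful way to close this argument, sketched here under the standard factorization theorem (the two sample points are chosen only for illustration): if $T$ were sufficient, so that $f(x|p)=g(t(x),p)\,h(x)$, then for any two points $x,y$ with $t(x)=t(y)$ the ratio $f(x|p)/f(y|p)$ would equal $h(x)/h(y)$ and be free of $p$. Taking $x=(1,0,1)$ and $y=(0,1,0)$, both of which have $t=2$:

$$\frac{f(1,0,1\,|\,p)}{f(0,1,0\,|\,p)}=\frac{p^2(1-p)}{p(1-p)^2}=\frac{p}{1-p},$$

which varies with $p$, so no such factorization can exist.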

But what if I want to follow the definition and check whether the ratio $\dfrac{f(x|p)}{g(T(x)|p)}$ is independent of $p$? Then I need to know $g$, the p.m.f. of $T(X)$. What, then, is the distribution of $T(X)=X_1+2X_2+X_3$?
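Since there are only 8 sample points, the distribution of $T(X)$ can be found by brute-force enumeration. The following is a minimal sketch in Python, not a general method; the helper names `prob`, `preimages`, and `pmf_T` are made up for illustration, and $p=1/3$ is an arbitrary test value.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

def T(x):
    # The statistic from the question: X1 + 2*X2 + X3.
    return x[0] + 2 * x[1] + x[2]

def prob(x, p):
    # Joint p.m.f. of three i.i.d. Bernoulli(p) draws at the point x.
    s = sum(x)
    return p**s * (1 - p)**(3 - s)

def preimages():
    # Group the 8 sample points by the value T takes on them.
    groups = defaultdict(list)
    for x in product((0, 1), repeat=3):
        groups[T(x)].append(x)
    return dict(groups)

def pmf_T(p):
    # P(T = t) is the sum of the joint p.m.f. over all x with T(x) = t.
    return {t: sum(prob(x, p) for x in xs) for t, xs in preimages().items()}

p = Fraction(1, 3)  # arbitrary illustrative value of the parameter
dist = pmf_T(p)
for t, xs in sorted(preimages().items()):
    print(t, xs, dist[t])
```

Reading off the groups gives $P(T=0)=(1-p)^3$, $P(T=1)=2p(1-p)^2$, $P(T=2)=p(1-p)$, $P(T=3)=2p^2(1-p)$, $P(T=4)=p^3$, which is not $Bin(4,p)$.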

Xi'an
Landon Carter
  • Hint: You don't need to know the full distribution of $T(X)$. Consider, for instance, the case $T(X)=2$: what is the conditional probability distribution of $(X|T(X)=2)$? – whuber Dec 31 '14 at 18:19
  • If $T(X)=2$ then $(X_1,X_2,X_3)\in\{(1,0,1),(0,1,0)\}$. So $P(X|T(X)=2)=p^2(1-p)+p(1-p)^2=p(1-p)$ which is dependent on $p$, correct? – Landon Carter Dec 31 '14 at 18:28
  • That's the right idea--but I don't see why you are adding the two probabilities. Isn't $X$ a *vector*? (If you like, you can use the same kind of calculations to find the full distribution of $T(X)$ (it can only attain the values $0,1,2,3,4$), but that's no longer necessary, is it?) – whuber Dec 31 '14 at 18:29
  • Yeah, right. Thanks! So once we show that this ratio is not independent of $p$ for at least one sample, then we are done! Thank you. And HAPPY NEW YEAR :) – Landon Carter Dec 31 '14 at 18:31
  • Yes $X$ is a vector, but more importantly $X=(X_1,X_2,X_3)$ and the probability $P(X|T(X)=2)=P(T(X)=2)=P(X=(1,0,1))+P(X=(0,1,0))$. Please correct me if I am wrong. – Landon Carter Dec 31 '14 at 18:35
  • How about a revised look at this? Taking your hint, if I show that for the sample $x=(1,0,1)$ the ratio depends on $p$, maybe I will be done. So, $\dfrac{P(X=(1,0,1))}{P(T(X)=T(1,0,1)=2)}=\dfrac{p^2(1-p)}{p^2(1-p)+p(1-p)^2}$ (by the logic in the above comment) $=\dfrac{p^2(1-p)}{p(1-p)}=p$, which is NOT independent of $p$, from which we conclude that $T$ is not sufficient. – Landon Carter Dec 31 '14 at 18:49
  • Why not post that as an answer? That will encourage the community to look it over and provide some feedback. – whuber Dec 31 '14 at 19:00
  • Yes, added. Basically the possible hint I got from you is what is reflected in the answer. – Landon Carter Dec 31 '14 at 19:09

1 Answer


Following a hint from whuber in the comments: pick a sample point $x$, evaluate $\dfrac{P(X=x)}{P(T(X)=T(x))}$ there, and check whether this ratio is independent of the parameter, in this case $p$. If the ratio depends on $p$ for even one sample point, $T$ is not sufficient.

So take $x=(1,0,1)$; then $T(1,0,1)=2$, and we evaluate $\dfrac{P(X=(1,0,1))}{P(T(X)=2)}$. Now, $$T(X)=2 \text{ iff } X\in\{(1,0,1),(0,1,0)\}.$$ By the i.i.d. property, $$P(X=(1,0,1))=p^2(1-p)\text{ and }P(X=(0,1,0))=p(1-p)^2.$$ Also $$P(T(X)=2)=P(X=(1,0,1))+P(X=(0,1,0))=p^2(1-p)+p(1-p)^2=p(1-p).$$

Hence $$\dfrac{P(X=(1,0,1))}{P(T(X)=2)}=\dfrac{p^2(1-p)}{p(1-p)}=p$$ which is clearly dependent on $p$, and therefore $T$ is not a sufficient statistic.
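The calculation can be double-checked numerically. The sketch below (helper names `prob` and `conditional` are hypothetical, and the two values of $p$ are arbitrary) computes $P(X=x \mid \text{stat}(X)=\text{stat}(x))$ by enumeration with exact fractions. For contrast it also uses $S(X)=X_1+X_2+X_3$, which is not part of the question but is the usual sufficient statistic for a Bernoulli sample.

```python
from fractions import Fraction
from itertools import product

def prob(x, p):
    # Joint p.m.f. of three i.i.d. Bernoulli(p) draws at the point x.
    s = sum(x)
    return p**s * (1 - p)**(3 - s)

def conditional(x, stat, p):
    # P(X = x | stat(X) = stat(x)), computed by enumerating all 8 points.
    t = stat(x)
    denom = sum(prob(y, p) for y in product((0, 1), repeat=3) if stat(y) == t)
    return prob(x, p) / denom

def T(x):
    # The statistic from the question.
    return x[0] + 2 * x[1] + x[2]

def S(x):
    # Contrast case (an addition, not from the question): the plain sum.
    return sum(x)

x = (1, 0, 1)
for p in (Fraction(1, 4), Fraction(3, 4)):  # arbitrary parameter values
    print(p, conditional(x, T, p), conditional(x, S, p))
```

With $T$ the conditional probability changes with $p$ (it equals $p$ itself at this sample point), while with $S$ it stays at $1/3$ for both values of $p$.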

whuber
Landon Carter