
According to the definition of sufficiency, a statistic is sufficient for a parameter if the conditional distribution of $X$ given the value of the statistic does not depend on the parameter.

What I am trying to understand is how the conditional distribution of $X$ not being a function of the parameter fits the intuitive meaning of sufficiency, i.e. that the statistic holds the same amount of information as the sample itself.

kjetil b halvorsen

2 Answers


If the distribution of something you observe does not depend on a parameter, it cannot possibly give you information about it.

Now, if the distribution of $X$ depends on the parameter $\theta$ and the distribution of $X$ given the sufficient statistic $S$ does not, it must be the case that all information about $\theta$ is in $S$; once the value of $S$ is given, the value of $X$ becomes irrelevant, because the conditional distribution of $X$ no longer depends on $\theta$.
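A concrete instance (my illustration, not part of the answer): for $X_1, \dots, X_n$ i.i.d. Bernoulli($\theta$) with sufficient statistic $S = \sum_i X_i$,

$$P(X = x \mid S = s) = \frac{P(X = x)}{P(S = s)} = \frac{\theta^{s}(1-\theta)^{n-s}}{\binom{n}{s}\theta^{s}(1-\theta)^{n-s}} = \binom{n}{s}^{-1},$$

which is free of $\theta$: once $s$ is known, the particular arrangement of successes in the sample carries no further information about $\theta$.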

F. Tusell

We need some more notation. Suppose the random variable $X$ has a distribution from some family $f(x; \theta)$ parametrized by $\theta \in \Theta$, and that $T=T(X)$ is a sufficient statistic for $\theta$. Then by the factorization theorem we have $$ f(x; \theta)= h(x) g(T(x); \theta) $$ where $h$ is a function not depending on $\theta$. Now, using the result from Can the Fisher factorization theorem be understood as a product of densities?, this can be interpreted as a factorization of the distribution of $X$, and we can use it to simulate from the distribution of $X$ by first simulating $T$ and then simulating from the conditional distribution of $X \mid T=t$.
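As a concrete check of the factorization (my example, not from the answer): for $X_1, \dots, X_n$ i.i.d. Bernoulli($\theta$) and $T(x) = \sum_i x_i$,

$$f(x; \theta) = \prod_{i=1}^n \theta^{x_i}(1-\theta)^{1-x_i} = \underbrace{1}_{h(x)} \cdot \underbrace{\theta^{T(x)}(1-\theta)^{n-T(x)}}_{g(T(x);\, \theta)},$$

so the factorization holds with $h \equiv 1$: the likelihood depends on the data only through $T(x)$.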

So after having observed $T(x)=t$, we can simulate surrogate data with the same distribution as $X$ by drawing from this conditional distribution, which by sufficiency does not depend on $\theta$. This gives intuitive meaning to sufficiency: knowing only $T(X)=t$, we can recreate by simulation surrogate data with the same distribution as $X$.
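This simulation idea can be sketched in code. A minimal sketch (my illustration, not the answer's code), assuming i.i.d. Bernoulli data where $T$ is the number of successes: given $T = t$, the conditional distribution of $X$ is uniform over all 0/1 vectors with exactly $t$ ones, so we can draw from it without ever knowing $\theta$.

```python
import numpy as np

# Illustration: X_1..X_n i.i.d. Bernoulli(theta), sufficient statistic
# T = sum of the X_i. Given T = t, X is uniform over all 0/1 vectors
# with exactly t ones -- a distribution that is free of theta.

rng = np.random.default_rng(0)
n, theta = 10, 0.7

x = rng.binomial(1, theta, size=n)   # observed data
t = int(x.sum())                     # its sufficient statistic

def surrogate(n, t, rng):
    """Draw from X | T = t: place t ones uniformly at random among n slots."""
    x_new = np.zeros(n, dtype=int)
    positions = rng.choice(n, size=t, replace=False)
    x_new[positions] = 1
    return x_new

x_star = surrogate(n, t, rng)        # surrogate data, simulated without theta
```

The key point is that `surrogate` never uses `theta`: everything it needs is contained in $t$, which is exactly what sufficiency promises.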

There are other ways to give intuitive meaning to $T$ having the same information content as $X$, via its use in inference. Without going into details:

  • The MLE (maximum likelihood estimator) of $\theta$ is a function of $T$ (or, if non-unique, can be chosen to be one)

  • Given a prior for $\theta$, the Bayesian posterior will be a function of $T$

and there are many more general results of this sort. The two above should be easy exercises.
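As an illustration of both bullets (my sketch, assuming $X_1, \dots, X_n$ i.i.d. Bernoulli($\theta$) with $T = \sum_i X_i$):

$$\hat\theta_{\mathrm{MLE}} = \frac{T}{n}, \qquad \theta \mid x \sim \mathrm{Beta}(a + T,\; b + n - T) \text{ for a } \mathrm{Beta}(a, b) \text{ prior},$$

so both the MLE and the posterior depend on the data only through $T$.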

kjetil b halvorsen