I have read the Wikipedia article and some other sources (jmanton's blog, Wasserman's blog).
The background is:
We have independent $X_i \sim N(\theta_i, 1)$, $i = 1, \ldots, n$, and we want to estimate each $\theta_i$.
For the MSE risk of estimating the whole vector, based on $(X_1, \ldots, X_n)$, when $n \ge 3$:
- the MLE estimator $\hat\theta_i = x_i$ is inadmissible;
- while the James-Stein estimator dominates the MLE in this case (see the quick simulation below).
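For reference, the estimator I mean is $\hat\theta^{JS} = \left(1 - \frac{n-2}{\lVert X\rVert^{2}}\right)X$. Here is a quick simulation I put together to check the claim (a minimal sketch; the dimension $n = 10$, the particular $\theta$ vector, and the number of replications are arbitrary choices just for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

n = 10                         # dimension; the result needs n >= 3
theta = np.linspace(-2, 2, n)  # arbitrary true means, just for illustration
n_rep = 100_000                # Monte Carlo replications

# One draw of X ~ N(theta, I_n) per replication
X = theta + rng.standard_normal((n_rep, n))

# MLE: just the observation itself
mle = X

# James-Stein: shrink X towards the origin by a data-dependent factor
shrink = 1 - (n - 2) / np.sum(X**2, axis=1, keepdims=True)
js = shrink * X

# Total squared-error loss, averaged over replications
risk_mle = np.mean(np.sum((mle - theta) ** 2, axis=1))
risk_js = np.mean(np.sum((js - theta) ** 2, axis=1))

print(f"MLE risk ~ {risk_mle:.3f}")  # comes out close to n = 10
print(f"JS  risk ~ {risk_js:.3f}")   # comes out noticeably smaller
```

The James-Stein risk does come out clearly below the MLE risk of about $n$, even though the coordinates are independent, and that is exactly what puzzles me.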
Looking at it from the shrinkage angle, here is a quote from Larry Wasserman:

> Note that the James-Stein estimator shrinks $X$ towards the origin. (In fact, you can shrink towards any point; there is nothing special about the origin.)
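To make the "any point" part concrete, the shrink-towards-a-fixed-point-$\nu$ version I have in mind is

$$
\hat\theta^{JS}_{\nu} = \nu + \left(1 - \frac{n-2}{\lVert X - \nu\rVert^{2}}\right)(X - \nu),
$$

with $\nu = 0$ giving the usual form.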
I understand that the James-Stein estimator differs from the MLE through its shrinkage behavior. But I still don't get it, since the variables are independent.
Why can shrinking towards an arbitrary point improve the MSE risk over the MLE, which does not shrink at all?
And also from Larry Wasserman:

> This can be viewed as an empirical Bayes estimator ....
If I view it from the empirical Bayes angle, it is shrinkage towards an overall mean, but that seems like nonsense when the variables to estimate are independent.
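For context, the empirical Bayes connection as I understand it goes like this: with a common prior $\theta_i \sim N(\mu, \tau^2)$, the posterior mean is

$$
E[\theta_i \mid X_i] = \mu + \left(1 - \frac{1}{1+\tau^2}\right)(X_i - \mu),
$$

and empirical Bayes replaces $\mu$ and $1/(1+\tau^2)$ with estimates computed from the whole vector, e.g. $\hat\mu = \bar X$ and (if I have the constant right) $(n-3)/\sum_j (X_j - \bar X)^2$, which yields the shrink-towards-the-overall-mean form of James-Stein. So the overall mean enters only through the shared prior, even though the $X_i$ are independent.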
Is there a better example or an intuitive explanation for this? Thanks.