
I have a hard time understanding the James-Stein estimator, so I tried to work through it with a Python example. I take a normally distributed random vector with mean $(0.1, 0.2, 0.3, 0.15, 0.11, 0.87)$ and variance 1 for each vector component. I then sample 20 data points and compute the average of each vector component, which is the maximum likelihood estimate. For the James-Stein estimator, I use the formula `mystein=(1-((len(mymean)-2)/(myMLE**2).sum()))*myMLE`.

With a sample size of 20, as in the example below, the James-Stein estimator outperforms the maximum likelihood estimator only rarely across 1000 simulation runs. If I instead sample only 1 data point, as in most examples I found, it does outperform it. I might be fundamentally misunderstanding something, but I don't know what.

import numpy as np

check = []
for i in range(1000):
    np.random.seed(i)
    mymean = np.array([0.1, 0.2, 0.3, 0.15, 0.11, 0.87])
    cov = np.eye(6)  # identity covariance: unit variance, independent components

    # draw 20 observations; transpose so each row holds one component
    myX = np.random.multivariate_normal(mymean, cov, 20).T

    # maximum likelihood estimate: component-wise sample mean
    myMLE = myX.mean(axis=1)

    # James-Stein shrinkage as posted (see the comment below for the fix)
    mystein = (1 - ((len(mymean) - 2) / (myMLE**2).sum())) * myMLE

    err_y = ((myMLE - mymean)**2).sum()
    err_js = ((mystein - mymean)**2).sum()

    check.append(err_js <= err_y)

print(sum(check))

I found my rather trivial mistake: I forgot to divide by the sample size of 20. The corrected line is `mystein=(1-((len(mymean)-2)/(20*(myMLE**2).sum())))*myMLE`. I won't delete the question, since it may be useful for others searching for similar topics. – Joe_base Aug 09 '19 at 17:34
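To make the fix concrete, here is a sketch of the full simulation with the corrected shrinkage factor. The key point is that the sample mean of `n` unit-variance observations has variance `1/n`, so the shrinkage denominator needs an extra factor of `n`; the loop count and per-run seeding mirror the question, but the variable names `n`, `p`, and `err_mle` are my own:

```python
import numpy as np

n = 20                                  # sample size per simulation run
mymean = np.array([0.1, 0.2, 0.3, 0.15, 0.11, 0.87])
p = len(mymean)                         # dimension; James-Stein requires p >= 3

check = []
for i in range(1000):
    rng = np.random.default_rng(i)
    X = rng.multivariate_normal(mymean, np.eye(p), n)  # n rows, p columns
    myMLE = X.mean(axis=0)              # component-wise sample mean

    # Corrected James-Stein estimator: the sample mean has variance 1/n,
    # so the denominator carries the factor n.
    mystein = (1 - (p - 2) / (n * (myMLE**2).sum())) * myMLE

    err_mle = ((myMLE - mymean)**2).sum()
    err_js = ((mystein - mymean)**2).sum()
    check.append(err_js <= err_mle)

print(sum(check))  # number of runs in which James-Stein beats the MLE
```

With this correction the shrinkage is scaled to the actual noise level of the mean estimate, and James-Stein should now beat the MLE in the majority of the 1000 runs.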
