As an exercise, I decided to check whether the unbiased estimator of the sample standard deviation gives better results than the biased estimator. So far, it looks like it does so in only approx. 55% of cases. Am I doing something wrong, or is this normal?

My methodology:

  • generate a sample of 100 numbers from the range 1 to 1000
  • 10,000 times, choose 10 numbers from the above
  • for each set of 10 numbers, calculate the biased and unbiased estimators of the standard deviation
  • check how many times the unbiased estimator was closer to the population standard deviation than the biased estimator was.

My code:

import numpy as np
from math import sqrt

# Let's generate a set of 100 random numbers.
# np.random.randint excludes the upper bound, so 1001 covers 1 to 1000.
myarray = np.random.randint(1, 1001, 100)
myarray_sd = np.std(myarray)  # population SD (divides by N)

right = 0
wrong = 0

for k in range(10000):
    # choose a sample of 10 numbers without replacement
    # (random.sample no longer accepts a set, and set() would drop duplicates)
    sample = np.random.choice(myarray, 10, replace=False)

    # calculate the mean
    sample_mean = np.mean(sample)

    # calculate sd for n and n-1 (the original used an undefined name, s)
    sample_sd_n = sqrt(sum((sample - sample_mean)**2) / len(sample))
    sample_sd_n1 = sqrt(sum((sample - sample_mean)**2) / (len(sample) - 1))

    # calculate the differences between each sd and the population sd
    res_n = abs(sample_sd_n - myarray_sd)
    res_n1 = abs(sample_sd_n1 - myarray_sd)

    # check whether the sd calculated using n-1 is more accurate
    if res_n1 < res_n:
        right += 1
    else:
        wrong += 1

print('The theory is correct in %.2f of cases' % (right / (right + wrong)))
Ferdi
michalk
  • [Neither of these estimators is unbiased.](https://stats.stackexchange.com/questions/11707) Comparing them to one another doesn't seem to have anything at all to do with bias, anyway. – whuber Oct 08 '17 at 22:00
  • What is your population distribution? Is it uniform on 1-1000 or something else, such as uniform over the selected 10? How are you computing the sample variance (i.e., what is N)? What is the population variance? Unbiasedness means that, averaged over many samples, the sample variance equals the population variance. It says nothing about how often it is closer to the population variance than some other estimate. – Michael R. Chernick Oct 08 '17 at 22:15
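
The distinction drawn in the comments can be checked numerically. The following is only a minimal sketch, not the asker's code: it assumes a hypothetical population of 100 integers, a fixed seed, and samples drawn with replacement (unlike the question's sampling without replacement) so that the draws are i.i.d. and the textbook bias results apply exactly. All variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, chosen only for reproducibility

# Hypothetical setup: a fixed population of 100 integers from 1 to 1000.
population = rng.integers(1, 1001, size=100)
pop_var = population.var()   # population variance (divides by N)
pop_sd = np.sqrt(pop_var)

# 100,000 samples of size 10, drawn WITH replacement so the draws are i.i.d.
samples = rng.choice(population, size=(100_000, 10), replace=True)

var_n = samples.var(axis=1, ddof=0)    # divides by n
var_n1 = samples.var(axis=1, ddof=1)   # divides by n - 1
sd_n = np.sqrt(var_n)
sd_n1 = np.sqrt(var_n1)

# Bias is a statement about averages over many samples.
ratio_var_n = var_n.mean() / pop_var     # theory: (n - 1) / n = 0.9
ratio_var_n1 = var_n1.mean() / pop_var   # theory: 1.0
ratio_sd_n1 = sd_n1.mean() / pop_sd      # below 1: the square root is still biased

# How often the n - 1 version lands closer is a separate question.
frac_closer = np.mean(np.abs(sd_n1 - pop_sd) < np.abs(sd_n - pop_sd))

print(ratio_var_n, ratio_var_n1, ratio_sd_n1, frac_closer)
```

Under these assumptions, the first two ratios should come out near 0.9 and 1.0, the third slightly below 1, and the last number is simply a frequency that unbiasedness does not constrain.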

0 Answers