I am trying to get the mean and standard deviation of the following sample means, given the sample frequency for each range of means. In a histogram, the x-axis would show the range of the mean and the y-axis the sample frequency. The sample size for all samples is n=5. I see that the answer is sample mean = 49.831 and sd = 11.43. However, I am not sure what formula to use to calculate these, as I do not have any information on the population standard deviation.

Range for mean  Sample frequencies
0-10             0
10-20            5
20-30            315
30-40            1606
40-50            3105
50-60            2935
60-70            1500
70-80            324
80-90            21
90-100           0
Ripon
  • It looks like this is binned data, so the mean is calculated as the mean of a vector in which the center of each bin is repeated as many times as indicated in the frequency column. For the standard deviation, a more sophisticated approach is necessary, which is illustrated in [this answer](https://stats.stackexchange.com/a/68238/61836). – matteo Jan 13 '20 at 22:18
  • The SD estimate of $11.43$ does not use Sheppard's correction. The correction subtracts $10^2/12$ from the variance, resulting in $$\sqrt{11.4263^2-10^2/12}=11.0556$$ for a better estimate. – whuber Jan 13 '20 at 23:00
  • Thank you for your answer. Actually I have used Sheppard's correction as well in R and found variance of 11.43 ```counts – Ripon Jan 13 '20 at 23:14

1 Answer

Welcome to Cross Validated. :)

The sample mean is $\mu = \frac{\sum_{i=1}^{n} x_i f_i}{\sum_{i=1}^{n} f_i}$ and the sample variance is $\sigma^2 = \frac{\sum_{i=1}^{n} (x_i-\mu)^2 f_i}{\sum_{i=1}^{n} f_i}$, where $n$ is the number of intervals (here 10, not the per-sample size $n=5$ from the question), $f_i$ is the sample frequency of the $i^{th}$ interval, and $x_i = \frac{x^{\text{upper}}_i+x^{\text{lower}}_i}{2}$ is the midpoint of the $i^{th}$ interval, i.e. the average of its upper and lower limits. The same formulas apply even if the class intervals have unequal widths.
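As a minimal sketch of these two formulas in Python (the function name `grouped_mean_var` and the example bins are hypothetical, chosen only for illustration), each interval is represented by its midpoint and weighted by its frequency. Nothing in it assumes equal bin widths:

```python
def grouped_mean_var(edges, freqs):
    """Mean and variance of binned data from bin limits and frequencies.

    edges: list of (lower, upper) limits, one pair per interval
    freqs: list of frequencies f_i, same length as edges
    """
    total = sum(freqs)
    # Midpoint x_i of each interval.
    mids = [(lo + hi) / 2 for lo, hi in edges]
    mean = sum(x * f for x, f in zip(mids, freqs)) / total
    # Frequency-weighted squared deviations, divided by the total frequency.
    var = sum(f * (x - mean) ** 2 for x, f in zip(mids, freqs)) / total
    return mean, var


# Hypothetical example with unequal interval widths.
mean, var = grouped_mean_var([(0, 10), (10, 30), (30, 60)], [2, 5, 3])
print(mean, var)  # prints: 24.5 212.25
```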

Range    Frequency   Midpoint × Frequency
0-10     0           0
10-20    5           75
20-30    315         7875
30-40    1606        56210
40-50    3105        139725
50-60    2935        161425
60-70    1500        97500
70-80    324         24300
80-90    21          1785
90-100   0           0

Sum      9811        488895

$\mu=488895/9811 \approx 49.831$. The sample variance is computed the same way from the formula above, and the sample standard deviation is its square root: $\sigma^2 \approx 130.55$, so $\sigma \approx 11.43$.
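To check the numbers, here is a short Python sketch that applies the formulas to the table from the question. It also shows the $N-1$ variant and Sheppard's correction mentioned in the comments (subtracting $h^2/12$ from the variance, with bin width $h = 10$). The variable names are illustrative only.

```python
import math

# Bin midpoints and frequencies from the table in the question.
mids = [5, 15, 25, 35, 45, 55, 65, 75, 85, 95]
freqs = [0, 5, 315, 1606, 3105, 2935, 1500, 324, 21, 0]

total = sum(freqs)  # 9811
mean = sum(x * f for x, f in zip(mids, freqs)) / total

# Sum of frequency-weighted squared deviations from the mean.
ss = sum(f * (x - mean) ** 2 for x, f in zip(mids, freqs))

sd_pop = math.sqrt(ss / total)           # divide by N
sd_sample = math.sqrt(ss / (total - 1))  # divide by N - 1

# Sheppard's correction for bin width h: subtract h^2 / 12 from the variance.
h = 10
sd_sheppard = math.sqrt(ss / (total - 1) - h ** 2 / 12)

print(round(mean, 3), round(sd_sample, 2), round(sd_sheppard, 2))
# prints: 49.831 11.43 11.06
```

With 9811 observations, dividing by $N$ or by $N-1$ makes no practical difference; both give about 11.43.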

Gautam Sreekumar
  • Would you mind showing how the calculation of variance works for you? It gives me a very different number. Thank you. – Ripon Jan 13 '20 at 23:18
  • Yeah sure. I have a gist showing the calculation. https://gist.github.com/gautamsreekumar/5cb087be199a44606ec519fbe5e657d0 – Gautam Sreekumar Jan 28 '20 at 20:20