
If one is dealing with small sample sizes, say $8$–$16$ observations per sample, and we are interested in estimates of the standard deviation (let us also assume Gaussian statistics), is there a reason not to apply a sample-size bias correction?

For example, the sample standard deviation, $$s = \sqrt{\frac{1}{N-1} \sum_{i}^{N} {(x_{i} - \bar{x})^2}} \text{,} $$ is a biased estimator of $\sigma$ -- especially for small sample sizes. This can be corrected with the $c_4(N)$ bias-correction factor, which gives an unbiased estimate of the standard deviation as $$\sigma_{\rm{est}} = s /c_{4}(N) \text{.}$$
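For concreteness, under Gaussian sampling the factor is $c_4(N) = \sqrt{2/(N-1)}\,\Gamma(N/2)/\Gamma((N-1)/2)$, which can be computed with the standard library alone. A minimal sketch (function names are illustrative):

```python
import math

def c4(n):
    """Gaussian bias-correction factor for the sample standard deviation:
    c4(N) = sqrt(2/(N-1)) * Gamma(N/2) / Gamma((N-1)/2)."""
    return math.sqrt(2.0 / (n - 1)) * math.gamma(n / 2) / math.gamma((n - 1) / 2)

def sample_sd(xs):
    """Bessel-corrected sample standard deviation s (biased for sigma)."""
    n = len(xs)
    mean = sum(xs) / n
    return math.sqrt(sum((x - mean) ** 2 for x in xs) / (n - 1))

def unbiased_sd(xs):
    """Unbiased estimate sigma_est = s / c4(N)."""
    return sample_sd(xs) / c4(len(xs))
```

For $N = 2$, $c_4 \approx 0.798$, so the correction is far from negligible at the smallest sample sizes; by $N = 10$ it has shrunk to $c_4 \approx 0.973$.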

There are also bias-correction factors for small-sample estimates obtained from MLE methods, or from the $\rm{MAD}$, $Q_{n}$, and $S_{n}$ dispersion estimators, which can likewise be used to estimate the standard deviation.

I ask because if we use an estimate of the standard deviation to produce a standard error for the mean, i.e. $$\mathrm{S.E.} = \sigma_{\rm{est}}/\sqrt{N} \text{,}$$ then for small sample sizes the correction factor can make quite a difference in the result. So:

  1. Is there a good reason not to account for sample size, especially when it is small?
  2. Does including bias correction in estimates that are used to produce standard errors and/or weights have any undesirable consequences?
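To illustrate the size of the effect behind the question, here is a minimal sketch (the `c4` helper and the value of $s$ are illustrative) comparing the standard error with and without the correction at $N = 8$:

```python
import math

def c4(n):
    # Gaussian bias-correction factor: sqrt(2/(N-1)) * Gamma(N/2) / Gamma((N-1)/2)
    return math.sqrt(2.0 / (n - 1)) * math.gamma(n / 2) / math.gamma((n - 1) / 2)

n = 8
s = 1.0  # suppose the sample standard deviation came out to 1.0

se_plain = s / math.sqrt(n)             # uncorrected S.E.
se_corrected = (s / c4(n)) / math.sqrt(n)  # S.E. using sigma_est = s / c4(N)

print(se_plain, se_corrected, se_corrected / se_plain - 1)
```

At $N = 8$, $c_4 \approx 0.965$, so the corrected standard error is roughly $3.6\%$ larger than the uncorrected one, which is exactly the kind of difference the question is about.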
Q.P.
    Aren't these questions thoroughly addressed at https://stats.stackexchange.com/questions/3931 ? – whuber Jan 08 '21 at 14:11
    I don't think so. Those refer to Bessel's correction, but if you have small sample sizes you seem to need more, i.e. $c_{4}$. If I shouldn't be using $c_{4}$, why do we have this correction factor at all? For diagnostics of an estimator's performance? – Q.P. Jan 09 '21 at 06:51

1 Answer

  1. If you were going to make a confidence interval or hypothesis test, the sample size is already accounted for in the degrees of freedom of the t-distribution. There is also a good reason for using the usual estimate of the variance: it is an unbiased estimate of $\sigma^2$. You can't have an estimator that is unbiased for $\sigma^2$ and whose square root is simultaneously unbiased for $\sigma$. One had to be chosen, and the convention is the estimator that is unbiased for $\sigma^2$.
  2. Yes. If you used an unbiased estimate of the S.E., you would need to adjust the t-distribution quantile accordingly: whatever factor you multiplied the S.E. by to remove the bias, you would have to divide the quantile by the same factor so the two cancel. You would always end up with the same confidence interval and the same p-value as you would have had without any bias adjustment. It just adds effort and confuses anyone who reads what you did.
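The trade-off in point 1 can be checked by simulation: with Bessel's correction, the sample variance averages to $\sigma^2$, while its square root averages to $c_4(N)\,\sigma < \sigma$. A quick Monte Carlo sketch (illustrative parameters, $N=8$, $\sigma=1$):

```python
import random
import statistics

random.seed(0)
n, sigma, trials = 8, 1.0, 200_000

sum_var = sum_sd = 0.0
for _ in range(trials):
    xs = [random.gauss(0, sigma) for _ in range(n)]
    v = statistics.variance(xs)  # Bessel-corrected sample variance
    sum_var += v
    sum_sd += v ** 0.5           # its square root, the sample s.d.

print(sum_var / trials)  # ≈ 1.0: unbiased for sigma^2
print(sum_sd / trials)   # ≈ 0.965 = c4(8): s underestimates sigma
```

The first average sits at $\sigma^2$ while the second sits at $c_4(8)\,\sigma \approx 0.965$, confirming that unbiasedness for $\sigma^2$ and for $\sigma$ cannot hold at the same time.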
John L