61

Suppose I'm running an experiment that can have 2 outcomes, and I'm assuming that the underlying "true" distribution of the 2 outcomes is a binomial distribution with parameters $n$ and $p$: ${\rm Binomial}(n, p)$.

I can compute the standard error, $SE_X = \frac{\sigma_X}{\sqrt{n}}$, from the form of the variance of ${\rm Binomial}(n, p)$: $$ \sigma^{2}_{X} = npq$$ where $q = 1-p$. So, $\sigma_X=\sqrt{npq}$. For the standard error I get: $SE_X=\sqrt{pq}$, but I've seen somewhere that $SE_X = \sqrt{\frac{pq}{n}}$. What did I do wrong?

Macro
Frank
  • This article is very helpful for understanding the standard error of the mean: http://influentialpoints.com/Training/standard_error_of_the_mean-principles-properties-assumptions.htm – Sanghyun Lee Jul 08 '18 at 09:23
  • From my googling, it appears that the closely related subject of getting confidence intervals for a binomial distribution is rather nuanced and complicated. In particular, it looks like confidence intervals obtained from this formula, which would be "Wald intervals" (see https://en.wikipedia.org/wiki/Binomial_proportion_confidence_interval), are rather poorly behaved and should be avoided. See https://www.jstor.org/stable/2676784?seq=1#metadata_info_tab_contents for more info. – aquirdturtle Sep 17 '18 at 17:47

4 Answers

89

It seems like you're using $n$ in two different ways - both as the sample size and as the number of Bernoulli trials that comprise the binomial random variable; to eliminate any ambiguity, I'm going to use $k$ to refer to the latter.

If you have $n$ independent samples from a ${\rm Binomial}(k,p)$ distribution, the variance of their sample mean is

$$ {\rm var} \left( \frac{1}{n} \sum_{i=1}^{n} X_{i} \right) = \frac{1}{n^2} \sum_{i=1}^{n} {\rm var}( X_{i} ) = \frac{ n {\rm var}(X_{i}) }{ n^2 } = \frac{ {\rm var}(X_{i})}{n} = \frac{ k pq }{n} $$

where $q=1-p$ and $\overline{X}$ is the sample mean. This follows since

(1) ${\rm var}(cX) = c^2 {\rm var}(X)$, for any random variable $X$ and any constant $c$;

(2) the variance of a sum of independent random variables equals the sum of the variances.

The standard error of $\overline{X}$ is the square root of the variance: $\sqrt{\frac{ k pq }{n}}$. Therefore,

  • When $k = n$, you get the formula you pointed out: $\sqrt{pq}$

  • When $k = 1$, and the binomial variables are just Bernoulli trials, you get the formula you've seen elsewhere: $\sqrt{\frac{pq}{n}}$
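
As a sanity check, here is a minimal simulation sketch (assuming Python with numpy; the values $k=10$, $p=0.3$, $n=50$ are arbitrary choices of mine, not from the question) that verifies the $\sqrt{\frac{kpq}{n}}$ formula numerically:

```python
import numpy as np

rng = np.random.default_rng(0)
k, p, n = 10, 0.3, 50  # k Bernoulli trials per binomial draw; n draws per sample
q = 1 - p

# 100,000 samples, each consisting of n independent Binomial(k, p) draws.
sample_means = rng.binomial(k, p, size=(100_000, n)).mean(axis=1)

print(sample_means.std())      # empirical standard error of the sample mean
print(np.sqrt(k * p * q / n))  # theoretical sqrt(kpq/n), about 0.205
```

Setting $k = 1$ in the same script reproduces $\sqrt{pq/n}$, and setting $k = n$ reproduces $\sqrt{pq}$.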

Macro
  • OK, very well. Now, I don't understand why we say that the variance of the Binomial is $npq$. I'm missing something between the variance of the Binomial and the variance of the sample, apparently? - Actually: $Var(X) = pq$ when $X$ is Binomial(n,p) (your derivation seems to say that)?? – Frank Jun 01 '12 at 16:36
  • When $X$ is a _Bernoulli_ random variable, then ${\rm var}(X) = pq$. When $X$ is a binomial random variable based on $n$ trials with success probability $p$, then ${\rm var}(X) = npq$. – Macro Jun 01 '12 at 16:48
  • @Frank, I've also edited my answer since you commented - I think this answer is more along the lines of what you were asking. – Macro Jun 01 '12 at 16:48
  • Thanks! You lifted my confusion. Sorry that it was so elementary, I'm still learning :-) – Frank Jun 01 '12 at 17:02
  • So is it clear to Frank that we are using the fact that for any constant $c$, ${\rm var}(cX) = c^2 {\rm var}(X)$? Since the sample estimate of the proportion is $X/n$, we have ${\rm var}(X/n) = {\rm var}(X)/n^2 = npq/n^2 = pq/n$, and $SE_X$ is the square root of that. I think it is clearer for everyone if we spell out all the steps. – Michael R. Chernick Jun 01 '12 at 21:42
  • @MichaelChernick, I've clarified the details you mentioned. Based on the problem description, I figured that Frank knew these facts but you're right that it would be more educational for future readers to include the details. – Macro Jun 01 '12 at 22:41
  • @Macro: sorry for the naive question, but I was reading your answer and I am trying to understand it better. I am still not clear on the difference between _k_ (the number of Bernoulli trials) and _n_ (the sample size). When you are doing an experiment where each trial can have an outcome of 1 or 0, isn't your sample size equivalent to the number of Bernoulli trials? In which cases are they not the same? E.g., if I toss a coin 50 times and I want to calculate the proportion of heads in my experiment, isn't n = k = 50? – Sol May 10 '14 at 13:29
  • Sol Lago - In this case k=1. If you flipped a coin 50 times and calculated the number of successes and then repeated the experiment 50 times, then k=n=50. A flip of a coin results in a 1 or 0. It is a Bernoulli r.v. – B_Miner May 10 '14 at 19:35
  • Thank you! My textbook keeps using n for both those things and it confused me so much! – chessprogrammer May 15 '20 at 01:22
  • @B_Miner, could you please clarify it once again? A binomial r.v. is the number of successes in a fixed number of Bernoulli trials. Suppose we toss a coin 5 times. If k (the number of Bernoulli trials) is equal to 1 in this case, then that means each binomial r.v. is a Bernoulli r.v. If we then calculate the number of successes and repeat the coin-toss process 4 more times (5 tosses per process), then we'll have 5 iterations in total. How would that differ from a case when we toss 5 coins at the same time and then repeat that 4 more times? – Viacheslav Moskalenko Jun 09 '20 at 21:44
16

It's easy to get two distributions confused here:

  • distribution of number of successes
  • distribution of the proportion of successes

The variance of the number of successes is $npq$, while the variance of the proportion of successes is $npq/n^2 = pq/n$. This results in different standard error formulas: $\sqrt{npq}$ for the count versus $\sqrt{pq/n}$ for the proportion.
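
A minimal sketch in Python (with numpy; the values $n=100$, $p=0.25$ are arbitrary choices of mine) showing the two different spreads:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 100, 0.25
q = 1 - p

counts = rng.binomial(n, p, size=200_000)  # number of successes in n trials
proportions = counts / n                   # proportion of successes

print(counts.std(), np.sqrt(n * p * q))       # SD of the count:      sqrt(npq), about 4.33
print(proportions.std(), np.sqrt(p * q / n))  # SD of the proportion: sqrt(pq/n), about 0.043
```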

Vlad
10

We can look at this in the following way:

Suppose we are doing an experiment where we toss a coin $n$ times. The overall outcome of the experiment is $Y$, the sum of the individual tosses (say, counting a head as 1 and a tail as 0). So, for this experiment, $Y = \sum_{i=1}^n X_i$, where the $X_i$ are the outcomes of the individual tosses.

Here, the outcome of each toss, $X_i$, follows a Bernoulli distribution and the overall outcome $Y$ follows a binomial distribution.

The complete experiment can be thought of as a single sample. Thus, if we repeat the experiment, we can get another value of $Y$, which will form another sample. All possible values of $Y$ will constitute the complete population.

Coming back to the single coin toss, which follows a Bernoulli distribution, the variance is given by $pq$, where $p$ is the probability of a head (success) and $q = 1 - p$.

Now, if we look at the variance of $Y$, $V(Y) = V(\sum X_i) = \sum V(X_i)$, since the individual tosses are independent. But, for all individual Bernoulli trials, $V(X_i) = pq$. Since there are $n$ tosses or Bernoulli trials in the experiment, $V(Y) = \sum V(X_i) = npq$. This implies that $Y$ has variance $npq$.

Now, the sample proportion is given by $\hat p = \frac Y n$, which gives the 'proportion of successes or heads'. Here, $n$ is a constant as we plan to take the same number of coin tosses for all the experiments in the population.

So, $V(\frac Y n) = (\frac {1}{n^2})V(Y) = (\frac {1}{n^2})(npq) = pq/n$.

So, the standard error for $\hat p$ (a sample statistic) is $\sqrt{pq/n}$.
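
For instance, with $n = 50$ tosses of a fair coin ($p = q = 0.5$): $V(Y) = 50 \times 0.25 = 12.5$, $V(\hat p) = 12.5/50^2 = 0.005$, and so the standard error of $\hat p$ is $\sqrt{0.005} \approx 0.071$.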

Silverfish
Tarashankar
  • You can use Latex typesetting by putting dollars around your math, e.g. `$x$` gives $x$. – Silverfish Jun 28 '16 at 20:52
  • Note that the step $V(\sum X_i)=\sum V(X_i)$ really deserves some justification! – Silverfish Jun 28 '16 at 20:53
There is a typo in the last deduction: V(Y/n) = (1/n^2)*V(Y) = (1/n^2)*npq = pq/n should be the correct deduction. – Tarashankar Jun 29 '16 at 02:05
  • Apologies, I introduced that when doing the typesetting. Hopefully sorted now. – Silverfish Jun 29 '16 at 02:45
  • Thank you, sincerely appreciate. Just for your other request, the formula translates to this V(X1 + X2 + ... + Xn) = V(X1) + V(X2) + ... + V(Xn), which is a property of Variance function. Regards and thank you, Tarashankar – Tarashankar Jun 29 '16 at 04:40
  • That's true if the $X_i$ are uncorrelated - to justify this, we use the fact that the trials are assumed to be independent. – Silverfish Jun 29 '16 at 10:18
4

I think there is also some confusion in the initial post between standard error and standard deviation. The standard deviation is the square root of the variance of a distribution; the standard error is the standard deviation of the estimated mean of a sample from that distribution, i.e., the spread of the means you would observe if you repeated the sampling infinitely many times. The former is an intrinsic property of the distribution; the latter is a measure of the quality of your estimate of a property (the mean) of the distribution.

When you do an experiment of $N$ Bernoulli trials to estimate the unknown probability of success, the uncertainty of your estimate $p = k/N$ after seeing $k$ successes is the standard error of the estimated proportion, $\sqrt{pq/N}$ where $q = 1 - p$. The true distribution is characterized by a parameter $P$, the true probability of success. If you did an infinite number of experiments with $N$ trials each and looked at the distribution of the number of successes, it would have mean $K = PN$, variance $NPQ$, and standard deviation $\sqrt{NPQ}$.
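
A short simulation sketch (in Python with numpy; the values $N = 100$ and $P = 0.4$ are illustrative assumptions of mine) that separates the two quantities:

```python
import numpy as np

rng = np.random.default_rng(0)
N, P = 100, 0.4  # N trials per experiment, true success probability P
Q = 1 - P

# Repeat the N-trial experiment many times, recording the number of successes k.
successes = rng.binomial(N, P, size=100_000)

# Standard deviation: intrinsic spread of the Binomial(N, P) distribution.
print(successes.std(), np.sqrt(N * P * Q))        # about 4.90

# Standard error: spread of the estimated proportion p = k/N across experiments.
print((successes / N).std(), np.sqrt(P * Q / N))  # about 0.049
```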

Stan