I have a question regarding the binomial distribution. I want to estimate p from the empirical choices made during my treatment so that I can calculate the standard deviation of my binary outcomes (coded 1 and 0). I have 240 outcomes (1 or 0) and I know their mean, but I am missing p to calculate the standard deviation.

Is there another way to calculate the standard deviation in R besides the binomial standard deviation formula? I was thinking of using the estimated variance from a logit model; is that a possibility too?

Thank you very much!

2 Answers


When you code a Bernoulli-distributed variable as 0 and 1, as you have, then the maximum likelihood estimate of $p$ is equal to the sample mean, which you already have. You can also calculate the sample standard deviation (SD) directly from the definition of the SD.
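
As a rough illustration of the point above, here is a minimal sketch in Python. The data are hypothetical (96 ones out of 240 is an assumption made purely for illustration, not the asker's actual result), and the variable names are mine. It estimates $p$ as the sample mean and compares the SD computed from the definition with the SD from the formula $\sqrt{p(1-p)}$.

```python
import math

# Hypothetical data: 240 outcomes coded 0/1.
# The count of 96 ones is an assumption for illustration only.
outcomes = [1] * 96 + [0] * 144

n = len(outcomes)
p_hat = sum(outcomes) / n  # sample mean, used as the estimate of p

# SD from the definition, dividing by N.
sd_definition = math.sqrt(sum((x - p_hat) ** 2 for x in outcomes) / n)

# SD from the Bernoulli formula sqrt(p * (1 - p)), with p replaced by p_hat.
sd_formula = math.sqrt(p_hat * (1 - p_hat))

# Variant dividing by N - 1 instead of N; slightly larger for finite N.
sd_n_minus_1 = math.sqrt(sum((x - p_hat) ** 2 for x in outcomes) / (n - 1))

print(p_hat, sd_definition, sd_formula, sd_n_minus_1)
```

The first two SD values agree up to floating-point error. Note that R's built-in `sd()` divides by $N - 1$, so it corresponds to the last variant; with 240 observations the difference is small.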

Kodiologist
  • Thank you very much for your input. However, I don't understand the last sentence. What exactly do you mean by calculating the sample SD directly from the definition of the SD? What do you mean by definition? Thx – Matthias Van Herrmann May 18 '18 at 21:45
  • @MatthiasVanHerrmann The definition of the sample SD is $\sqrt{\frac{\sum (X - \bar X)^2}N}$. – Kodiologist May 19 '18 at 07:10
  • Are you sure that I can use the definition of the sample SD for a binomial distribution, given that a separate formula for the binomial SD exists? – Matthias Van Herrmann May 19 '18 at 11:51
  • @MatthiasVanHerrmann Yes. The formula in question is correct because it is implied by the definition. – Kodiologist May 19 '18 at 15:32
  • @MatthiasVanHerrmann You're welcome. If I answered your question to your satisfaction, you can accept my answer by clicking the check mark under the voting arrows. – Kodiologist May 19 '18 at 19:56
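
The statement in the comments that the formula is implied by the definition can be checked in a few lines. This is only a sketch of the algebra; the symbol $\hat p$ for the sample mean $\bar X$ is my notation, not the answerer's. Because each $X_i$ is 0 or 1, $X_i^2 = X_i$, so

$$\sum_i (X_i - \bar X)^2 = \sum_i X_i^2 - N \bar X^2 = N \bar X - N \bar X^2 = N \bar X (1 - \bar X),$$

and therefore

$$\sqrt{\frac{\sum (X - \bar X)^2}{N}} = \sqrt{\bar X (1 - \bar X)} = \sqrt{\hat p (1 - \hat p)}.$$

This is the SD of a single 0/1 outcome. The SD of a binomial count over $n$ trials is $\sqrt{n p (1 - p)}$, which follows from the same per-trial variance.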

A proportion does not contain information that can be directly plugged into the 'definition' of a standard deviation, but a commonly used approximate method for calculating a confidence interval for a binomial proportion includes a formula that can be used as an approximation of the standard deviation. See the question "Variance of sample proportion decreases with n but of a count increases with n - why?"

Note, however, that the approximate confidence intervals often have pretty bad coverage. I have argued that different methods should be used. See here: Putting a confidence interval on the mean of a very rare event
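
To make the coverage concern concrete, here is a hedged sketch in Python. It assumes that the "commonly used approximate method" is the normal-approximation (Wald) interval, whose standard error is $\sqrt{\hat p(1-\hat p)/n}$, and it uses the Wilson score interval as one commonly cited alternative. That choice is mine; the linked answer may recommend a different method. The function names and the example counts are hypothetical.

```python
import math

def wald_interval(successes, n, z=1.96):
    """Normal-approximation interval: p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)."""
    p_hat = successes / n
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - z * se, p_hat + z * se

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval, shown as one possible alternative (an assumption)."""
    p_hat = successes / n
    denom = 1 + z ** 2 / n
    centre = (p_hat + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half, centre + half

# Hypothetical counts: 96 successes out of 240, and a rare-event case with 0 successes.
print(wald_interval(96, 240), wilson_interval(96, 240))
print(wald_interval(0, 240), wilson_interval(0, 240))
```

With zero observed successes the Wald interval collapses to a single point at 0, which is one way the approximation breaks down for rare events, while the Wilson interval still has a positive upper limit.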

Michael Lew
  • Hello. Thanks for your reply. I have read both of the links you sent me. I cannot find the formula for a confidence interval for binomial proportions, only the binomial SD formula, for which I would need p, which I don't have. How can I derive p from my data? Or shall I take Var(X)/n^2? How can I calculate Var(X)? I am really confused about how to plug those numbers into the formula. Thx – Matthias Van Herrmann May 19 '18 at 11:49
  • The p used in the formula is the observed proportion of successes. – Michael Lew May 19 '18 at 21:31