First post on this website. I have a very basic question about evaluating a somewhat large sum. The quantity to compute is: $$P(X<262) = \sum_{x=0}^{261} {5236 \choose x}p^{x}(1-p)^{5236-x}$$

I need to evaluate the RHS with p = 0.03. I haven't done this in a while, so I'm a little rusty and not sure whether it can be simplified with clever algebra or whether there are online tools for this kind of routine calculation.

Also, I am not sure whether the Poisson approximation is applicable or even useful (n = 5236, so np ≈ 157). I do not know R as of now.

Further clarification: if it is not apparent from the question, X is distributed Binomial(n, p) with n = 5236 and p = 0.03, and I am trying to compute the probability that the '# of Heads' in 5236 'tosses' is less than 262.

Thank you for reading this wall of text!! Input much appreciated!!

  • Is "$x$" supposed to be a random variable & the "$X$" on the LHS a different RV? – gung - Reinstate Monica Apr 06 '16 at 15:21
  • If you just want to compute it then you can use the R function pnorm –  Apr 06 '16 at 15:23
  • I thought I was using standard notation: $$P(X=x) = {n \choose x}p^{x} (1-p)^{n-x}$$ – OctaveParango Apr 06 '16 at 15:42
  • This probability can be represented in closed form as a regularized incomplete Beta function $I_{1-p}(5236-261,1+261)$. See https://en.wikipedia.org/wiki/Binomial_distribution#Cumulative_distribution_function for instance. This relationship is illustrated and explained on our site at http://stats.stackexchange.com/questions/4659 . – whuber Apr 06 '16 at 15:49
  • It is so close to 1 it will make your eyes spin. I get 0.9999999999999954046.... using direct high precision calculation of cumulative distribution function. – Mark L. Stone Apr 06 '16 at 15:55
  • Thanks for doing this computation separately!! Always good to have a second opinion when I run this in R to make sure I didn't screw this up. – OctaveParango Apr 06 '16 at 16:31

1 Answer

There is no "algebraic trick"; you need to compute the sum.

There are functions for doing that in R (pbinom) and Python (scipy.stats.binom).
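As a minimal sketch of "computing the sum" without R: since p = 0.03 = 3/100 is rational, the sum can be evaluated exactly with Python integers and `fractions.Fraction`, which avoids floating-point cancellation so close to 1. The function name `binom_cdf_exact` is made up for this illustration, and the code assumes Python 3.8+ for `math.comb`.

```python
from fractions import Fraction
from math import comb


def binom_cdf_exact(n: int, k: int, p: Fraction) -> Fraction:
    """Exact P(X <= k) for X ~ Binomial(n, p), with p given as a Fraction in [0, 1]."""
    num, den = p.numerator, p.denominator
    q_num = den - num  # numerator of 1 - p over the same denominator
    # Each term comb(n, x) * p^x * (1-p)^(n-x) shares the denominator den^n,
    # so only the integer numerators need to be summed.
    total = sum(comb(n, x) * num**x * q_num ** (n - x) for x in range(k + 1))
    return Fraction(total, den**n)


cdf = binom_cdf_exact(5236, 261, Fraction(3, 100))  # P(X < 262)
tail = 1 - cdf                                      # P(X >= 262), still exact

print(float(cdf))   # rounds to a double just below 1
print(float(tail))  # the tiny upper tail keeps its significant digits
```

If SciPy is available, `scipy.stats.binom.cdf(261, 5236, 0.03)` and `scipy.stats.binom.sf(261, 5236, 0.03)` should return the same two quantities in floating point; the upper tail is the more informative number here because the CDF itself is nearly indistinguishable from 1.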

What you can do is approximate with the Gaussian distribution.

The expected value is $\mu=np=5236p$, and the standard deviation is $\sigma=\sqrt{np(1-p)}$.

The following approximation might be helpful: $$ \Phi(x) \approx \frac{1}{2} \left\{ 1 + \operatorname{sign}(x) \left[ 1 - e^{-\frac{2}{\pi} x^2} \right]^{\frac{1}{2}} \right\} $$

For more approximations, see here
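A rough sketch of how these approximations behave for the numbers in the question, using only the standard library. It assumes $\Phi$ is the standard normal CDF and $x$ the standardized value $(k + 0.5 - \mu)/\sigma$; the continuity correction of 0.5 is my addition, not something stated above. The Poisson tail is included because the question raises it. All function names are illustrative.

```python
from math import copysign, erfc, exp, lgamma, log, pi, sqrt

n, p, k = 5236, 0.03, 261        # target: P(X <= 261) = P(X < 262)
mu = n * p                       # expected value
sigma = sqrt(n * p * (1 - p))    # standard deviation


def phi_approx(x: float) -> float:
    """Closed-form approximation to the standard normal CDF quoted above."""
    return 0.5 * (1 + copysign(1.0, x) * sqrt(1 - exp(-2 * x * x / pi)))


def normal_upper_tail(x: float) -> float:
    """1 - Phi(x), via erfc so that tiny tails do not round to zero."""
    return 0.5 * erfc(x / sqrt(2))


def poisson_upper_tail(lam: float, k_min: int, terms: int = 2000) -> float:
    """P(Y >= k_min) for Y ~ Poisson(lam), each term evaluated in log space."""
    return sum(
        exp(-lam + j * log(lam) - lgamma(j + 1))
        for j in range(k_min, k_min + terms)
    )


z = (k + 0.5 - mu) / sigma       # standardized cutoff (continuity-corrected)
gauss_cdf = phi_approx(z)        # Gaussian estimate of P(X < 262)
gauss_tail = normal_upper_tail(z)
pois_tail = poisson_upper_tail(mu, k + 1)

print(z, gauss_cdf, gauss_tail, pois_tail)
```

The cutoff sits roughly 8.5 standard deviations above the mean, so in double precision the closed-form approximation simply returns 1. The two tail estimates also differ from each other by orders of magnitude, which suggests neither approximation should be trusted for the tail probability this far out; they only confirm that the answer is extremely close to 1.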

Uri Goren
  • There are *many* ways to perform this calculation (to arbitrary accuracy) without obtaining the sum. See http://math.stackexchange.com/questions/53925 for a few. *Numerical Recipes* provides a continued fraction expansion that converges (in the worst case) in $\sqrt{\max(5236-261,261+1)}\approx 71$ steps. Such methods become essential once $n$ and the number of terms grow large. Your approximation is mysterious: what is $x$? What does $\Phi$ represent? – whuber Apr 06 '16 at 16:10
  • You are right, it seems the answer to my question is just a one-liner in R, so I'm off to do that. Someone above commented what they got, so I have that as a sanity check. – OctaveParango Apr 06 '16 at 16:38
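For readers curious about the continued-fraction route mentioned in the comments, here is a hedged sketch of evaluating the regularized incomplete Beta function with a modified Lentz iteration, in the spirit of the *Numerical Recipes* routine. It is not a copy of that code: the names `beta_cf` and `reg_inc_beta`, the tolerance, and the iteration cap are my own choices. It uses the identity $P(X \ge k) = I_p(k, n-k+1)$, which is the complement of the $I_{1-p}(5236-261, 1+261)$ form given in the question's comments.

```python
from math import exp, lgamma, log


def beta_cf(a: float, b: float, x: float, eps: float = 1e-14, max_iter: int = 1000) -> float:
    """Continued fraction for the incomplete Beta function (modified Lentz method)."""
    tiny = 1e-300  # guard against division by zero

    def clamp(v: float) -> float:
        return tiny if abs(v) < tiny else v

    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 / clamp(1.0 - qab * x / qap)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        # Even step of the recurrence
        aa = m * (b - m) * x / ((qam + m2) * (a + m2))
        d = 1.0 / clamp(1.0 + aa * d)
        c = clamp(1.0 + aa / c)
        h *= d * c
        # Odd step of the recurrence
        aa = -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2))
        d = 1.0 / clamp(1.0 + aa * d)
        c = clamp(1.0 + aa / c)
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            return h
    raise RuntimeError("continued fraction did not converge")


def reg_inc_beta(a: float, b: float, x: float) -> float:
    """Regularized incomplete Beta function I_x(a, b) for 0 <= x <= 1."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    # Prefactor x^a (1-x)^b / B(a, b), computed in log space
    bt = exp(lgamma(a + b) - lgamma(a) - lgamma(b) + a * log(x) + b * log(1.0 - x))
    if x < (a + 1.0) / (a + b + 2.0):
        return bt * beta_cf(a, b, x) / a
    # Use the symmetry I_x(a, b) = 1 - I_{1-x}(b, a) where the fraction converges faster
    return 1.0 - bt * beta_cf(b, a, 1.0 - x) / b


n, p, k = 5236, 0.03, 262
tail = reg_inc_beta(k, n - k + 1, p)  # P(X >= 262)
print(tail, 1.0 - tail)               # upper tail and P(X < 262)
```

Working with the upper tail directly keeps the significant digits that would be lost by forming a number within about 1e-14 of 1 and subtracting afterwards.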