If $20$ random numbers are selected independently from the interval $(0,1)$, what is the probability that the sum of these numbers is at least $8$?

Question

If $20$ random numbers are selected independently from the interval $(0,1)$, what is the probability that the sum of these numbers is at least $8$?

I tried to use this question, https://math.stackexchange.com/questions/285362/choosing-two-random-numbers-in-0-1-what-is-the-probability-that-sum-of-them, as a reference, but I got stuck at the step with the double integral. Do I have to evaluate 20 integrals?

https://en.wikipedia.org/wiki/Irwin%E2%80%93Hall_distribution — Łukasz Deryło, Jul 28 '21 at 11:17
In addition to Łukasz's comment: A normal approximation works well here. — COOLSerdash, Jul 28 '21 at 11:43
Use the methods applied to a closely related problem at https://stats.stackexchange.com/questions/194352. A direct method is to compute the entire distribution of the sum of those $20$ random values; many ways to perform that calculation are presented at https://stats.stackexchange.com/questions/41467. — whuber, Jul 28 '21 at 11:55
While it's perfectly possible to do the calculation, if you're doing an exercise, I expect the intent is probably that you'd use a normal approximation; you're not far into the tail, it should do quite well. Of course, more revealing still would be to do both. — Glen_b, Jul 28 '21 at 12:05
Please suggest edits or point out any policy violation before requesting to close the question. — simran, Jul 28 '21 at 13:09
People have kindly suggested several possible approaches but you have not said why they do not help you so it is not clear what you are looking for. — mdewey, Jul 28 '21 at 13:24
@mdewey Actually, I haven't tried those suggestions yet. I will do so and report my progress as soon as possible. — simran, Jul 28 '21 at 13:29
Please add the [tag:self-study] tag & read its [wiki](https://stats.stackexchange.com/tags/self-study/info). Then tell us what you understand thus far, what you've tried & where you're stuck. We'll provide hints to help you get unstuck. Please make these changes as just posting your homework & hoping someone will do it for you is grounds for closing. — kjetil b halvorsen, Jul 29 '21 at 00:59
@mdewey Please have a look and comment: I used the CLT in my answer. — simran, Jul 29 '21 at 06:32
In addition to the Irwin-Hall distribution and normal approximation, I'll add as a third option that you can estimate this quantity with Monte Carlo methods. — DifferentialPleiometry, Jul 29 '21 at 15:38
Your edited version of the question is thoroughly answered in the thread at https://stats.stackexchange.com/questions/41467 (mentioned in my first comment). — whuber, Jul 30 '21 at 13:14
For reference, an exact (rational) answer can be obtained from [Wolfram Alpha](https://www.wolframalpha.com/input/?i=1+-+%288%5E20+-20*7%5E20+%2B20*19*6%5E20+%2F+2+-20*19*18*5%5E20+%2F+6+%2B20*19*18*17*4%5E20+%2F+24+-20*19*18*17*16*3%5E20+%2F+5%21+%2B20*19*18*17*16*15*2%5E20+%2F+6%21+-20*19*18*17*16*15*14%2F7%21%29+%2F+20%21) as `285575185325803781/304112751022080000`, equal to `0.9390437736202311` in double-precision floating point. (A black-box calculation can be had with `1 - CDF(UniformSumDistribution(20), 8)`.) — whuber, Jul 31 '21 at 16:54
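The alternating sum in the Wolfram Alpha link above is the Irwin-Hall CDF evaluated at 8. As a minimal sketch, assuming the standard formula $F(x) = \frac{1}{n!}\sum_{k=0}^{\lfloor x \rfloor} (-1)^k \binom{n}{k} (x-k)^n$, the same exact rational can be computed with Python's standard library. The function name `irwin_hall_cdf` is illustrative and does not come from the thread.

```python
from fractions import Fraction
from math import comb, factorial, floor


def irwin_hall_cdf(x, n):
    """Exact P(U_1 + ... + U_n <= x) for independent Uniform(0,1) draws."""
    x = Fraction(x)
    if x <= 0:
        return Fraction(0)
    if x >= n:
        return Fraction(1)
    # Alternating sum over k = 0 .. floor(x); exact rational arithmetic throughout.
    total = sum((-1) ** k * comb(n, k) * (x - k) ** n for k in range(floor(x) + 1))
    return total / factorial(n)


# Probability that the sum of 20 uniforms is at least 8.
p_exact = 1 - irwin_hall_cdf(8, 20)
print(p_exact)         # exact fraction
print(float(p_exact))  # double-precision value
```

No integrals are needed: the sum has only nine terms here (the $k = 8$ term is zero), and `Fraction` avoids the cancellation error that floating point would suffer in an alternating sum of large powers.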

EngrStudent · Accepted Answer · 2021-07-30T14:52:10.530

It can be helpful to have a "gross reality check" (or GRC; some people call it a sanity check) that comes at the problem sideways and can tell you if you are doing something wrong.

Here is R code to simulate the problem and give an estimate:

  set.seed(1)
  temp <- numeric(length=20000)
  for(i in 1:20000){
    # y <- sample(c(0,1),20,T)  #(wrong! Thanks @whuber) discrete
    y <- runif(n=20)  # continuous outputs
    
    #is it 8 or more
    temp[i] <- ifelse(sum(y)>=8,1,0)
  }
  mean(temp)

This is what it gives:

> mean(temp)
[1] 0.94265

After 20k trials I would expect the estimate to be within about 1% of the theoretical result, and usually within a few tenths of a percent.
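One rough way to quantify that expectation is the binomial standard error $\sqrt{p(1-p)/n}$ of a simulated proportion. The sketch below assumes $p$ is the exact value quoted in the comments on the question and $n$ is the 20,000 trials used in the R loop; the variable names are illustrative.

```python
from math import sqrt

p = 0.9390437736202311  # exact probability quoted in the question's comments
n_trials = 20_000       # trials per run, matching the R loop

# Standard error of a simulated proportion, and a normal-theory 95% half-width.
se = sqrt(p * (1 - p) / n_trials)
half_width_95 = 1.96 * se

print(f"standard error  = {se:.5f}")
print(f"95% half-width  = {half_width_95:.5f}")
```

Under these assumptions a single run of 20k trials should typically land within a few tenths of a percentage point of the true probability.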

Here is a plot of 20 runs, showing convergence and spread of the estimate

Here is the list of the tail value for the runs, and the residual from the ensemble mean:

      mean      err
1  0.94265  0.00324
2  0.94160  0.00219
3  0.93955  0.00014
4  0.94190  0.00249
5  0.93775 -0.00166
6  0.93580 -0.00361
7  0.93840 -0.00101
8  0.93500 -0.00441
9  0.93735 -0.00206
10 0.94030  0.00089
11 0.94160  0.00219
12 0.93965  0.00024
13 0.94005  0.00064
14 0.93810 -0.00131
15 0.93990  0.00049
16 0.93995  0.00054
17 0.93735 -0.00206
18 0.94125  0.00184
19 0.94070  0.00129
20 0.93935 -0.00006

They don't move around much. The standard deviation of those means is ~0.00204, while the ensemble mean is 93.941%.

The estimates 93.94% (analytic, from the normal approximation) and 93.941% (simulated) are ~0.0048 standard deviations apart, which indicates to me that the analytic approach is on the right track.
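The summary figures can be re-derived from the 20 run means listed in the table. The sketch below is one possible check, not part of the original answer: it recomputes the ensemble mean and the run-to-run standard deviation, then expresses the gap to each reference value in units of the standard error of the ensemble mean (run SD divided by $\sqrt{20}$), which is an assumption about the appropriate yardstick. The reference values are the 0.9394 from the normal approximation and the exact value quoted in the question's comments.

```python
from math import sqrt
from statistics import mean, stdev

# Run means copied from the table above.
run_means = [
    0.94265, 0.94160, 0.93955, 0.94190, 0.93775,
    0.93580, 0.93840, 0.93500, 0.93735, 0.94030,
    0.94160, 0.93965, 0.94005, 0.93810, 0.93990,
    0.93995, 0.93735, 0.94125, 0.94070, 0.93935,
]

ensemble_mean = mean(run_means)
run_sd = stdev(run_means)                          # spread of a single 20k-trial run
se_of_ensemble = run_sd / sqrt(len(run_means))     # spread of the 20-run average

references = {
    "normal approximation": 0.9394,
    "exact (Irwin-Hall)": 0.9390437736202311,
}

print(f"ensemble mean = {ensemble_mean:.5f}, run SD = {run_sd:.5f}")
for label, value in references.items():
    z = (ensemble_mean - value) / se_of_ensemble
    print(f"{label}: difference = {ensemble_mean - value:+.5f}, z = {z:+.2f}")
```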

**This answer is incorrect,** because it samples from the set $\{0,1\}$ rather than the entire interval $(0,1).$ Contrast it with `n = 8)`. — whuber, Jul 29 '21 at 16:39
What is a "gross reality check"? Is that distinct from a [sanity check](https://en.wikipedia.org/wiki/Sanity_check)? — DifferentialPleiometry, Jul 29 '21 at 17:05
@whuber - I always learn from you! My answer has been updated. Thank you for your help. — EngrStudent, Jul 29 '21 at 17:35
@Galen - One of the people who taught me the most (Walt Flom) referred to them as gross reality checks. That term sticks with me. Sanity check is a suitable synonym. — EngrStudent, Jul 29 '21 at 18:38

simran · Answer 2 · 2021-07-29T15:26:44.757

3

Let $X_i$ be the $i^{\text{th}}$ number selected, where $i = 1, 2, \ldots, 20$.

To calculate:

$P\left(\sum_{i=1}^{20} X_i \ge 8\right)$

$E(X_i) = \frac{0+1}{2}$ [uniform distribution]

$E(X_i) = \frac{1}{2}$

$E\left(\sum_{i=1}^{20} X_i\right) = 20/2 = 10$

$\operatorname{Var}(X_i) = \frac{(1-0)^2}{12}$ [uniform distribution]

$\operatorname{Var}(X_i) = \frac{1}{12}$

$\operatorname{Var}\left(\sum_{i=1}^{20} X_i\right) = 20/12 = 5/3$

$P\left(\frac{\sum_{i=1}^{20} X_i - E\left(\sum_{i=1}^{20} X_i\right)}{\sqrt{\operatorname{Var}\left(\sum_{i=1}^{20} X_i\right)}} \ge \frac{8 - E\left(\sum_{i=1}^{20} X_i\right)}{\sqrt{\operatorname{Var}\left(\sum_{i=1}^{20} X_i\right)}}\right)$

$P\left(\frac{\sum_{i=1}^{20} X_i - 10}{\sqrt{5/3}} \ge \frac{8 - 10}{\sqrt{5/3}}\right)$

$1 - P(Z \le -1.55)$

$= 0.9394$ (approx.)
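The arithmetic above can be checked numerically. This is a minimal sketch using only Python's standard library; the helper `std_normal_cdf` is an illustrative name, built from `math.erf` via $\Phi(z) = \tfrac12\left(1 + \operatorname{erf}(z/\sqrt{2})\right)$.

```python
from math import erf, sqrt


def std_normal_cdf(z):
    """Standard normal CDF, Phi(z), expressed through the error function."""
    return 0.5 * (1 + erf(z / sqrt(2)))


n = 20
mean_sum = n * 0.5   # E(sum) = 20 * 1/2
var_sum = n / 12     # Var(sum) = 20 * 1/12, using independence

z = (8 - mean_sum) / sqrt(var_sum)
p_unrounded = 1 - std_normal_cdf(z)      # using the unrounded z
p_rounded = 1 - std_normal_cdf(-1.55)    # using z rounded to -1.55, as in a table lookup

print(f"z = {z:.4f}")
print(f"P(sum >= 8), unrounded z: {p_unrounded:.4f}")
print(f"P(sum >= 8), z = -1.55:   {p_rounded:.4f}")
```

Rounding $z$ to two decimals changes the result only in the fourth decimal place, so the table lookup is adequate for this approximation.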

edited Jul 29 '21 at 15:26

answered Jul 29 '21 at 06:30

The inequality sign in the second-to-last line should be flipped, I think. – COOLSerdash Jul 29 '21 at 07:32
@coolserdash Why? I calculated the right-hand side and it's -1.55, so why would the inequality sign change? – simran Jul 29 '21 at 14:03
The standard normal CDF $\Phi(x)$ gives $P(X\leq x)$. So $1-\Phi(x)$ gives $P(X>x)$ which is what you want and calculated. Accordingly, the notation should read $1 - P(Z\leq -1.55)$ which is $P(Z>-1.55)$ (the equality doesn't matter here because it's a continuous variable). – COOLSerdash Jul 29 '21 at 14:58
@coolserdash oh yes thanks – simran Jul 29 '21 at 15:26

score 1 · Answer 3 · edited Jul 29 '21 at 18:39

Here is a histogram of 100,000 simulations, each taking the sum of 20 uniform random deviates. Based on this simulation, the sum of uniform deviates is well approximated by a normal distribution with an estimated mean of 10.004 and an estimated variance of 1.680. Using the normal approximation, the probability that $\sum_{i=1}^{20} X_i \ge 8$ is approximately $0.94$.

Code follows:

data uniform;
  do sim=1 to 100000;
    do i=1 to 20;
      y=rand('uniform');
      output;
    end;
  end;
run;

proc means data=uniform noprint;
by sim;
var y;
output out=out sum(y)=sum;
run;


ods graphics / height=3in width=6in border=no;

proc sgplot data=out;
histogram sum;
density sum / type=normal;
run;

proc means data=out mean var;
var sum;
output out=estimates mean(sum)=mean var(sum)=var;
run;

data estimates;
set estimates;
prob=1-cdf('normal',8,mean,sqrt(var));
run;

proc print data=estimates noobs;
var prob;
run;
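For readers without SAS, the same steps (simulate sums, estimate the mean and variance, plug them into a normal CDF) can be sketched in Python with the standard library. This is an illustration under stated assumptions, not a translation of the code above: it uses 20,000 simulations rather than 100,000 to keep the run short, the seed is arbitrary, and the variable names are made up for the example.

```python
import random
from math import erf, sqrt
from statistics import fmean, variance

random.seed(1)  # arbitrary seed, only for reproducibility

n_sims = 20_000  # fewer than the 100,000 used in the SAS run, for speed
n_draws = 20

# Each simulation is the sum of 20 independent Uniform(0,1) deviates.
sums = [sum(random.random() for _ in range(n_draws)) for _ in range(n_sims)]

est_mean = fmean(sums)
est_var = variance(sums)  # sample variance (n - 1 denominator)

# Normal approximation with the estimated mean and variance.
z = (8 - est_mean) / sqrt(est_var)
prob_normal = 1 - 0.5 * (1 + erf(z / sqrt(2)))

# Direct empirical proportion, for comparison.
prob_empirical = sum(s >= 8 for s in sums) / n_sims

print(f"mean = {est_mean:.3f}, variance = {est_var:.3f}")
print(f"normal approximation: {prob_normal:.4f}")
print(f"empirical proportion: {prob_empirical:.4f}")
```

Reporting the empirical proportion alongside the fitted-normal value shows how much of the answer comes from the normality assumption and how much from the simulation itself.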

Thanks [@EngrStudent](https://stats.stackexchange.com/users/22452/engrstudent)! Does simply adding the phrase "Code follows:" produce the code formatting? — Geoffrey Johnson, Jul 29 '21 at 18:43
No it doesn't. Have a look at https://stackoverflow.com/editing-help for details. — mhdadk, Jul 29 '21 at 18:49
