2

What I know:

with R as a random variable from a discrete uniform distribution of 1000 numbers [1, 1000]. there is a 1/1000 chance to have R=123 (or any other number in [1, 1000])

What I think I know:

so if we test this 1000 times, there must be a "good" chance to see some R=123. (good chance is a probability near 1).

First question: AM I RIGHT ?

Second question: why is the said probability around 0.63 ?

0.63 comes from testing with below algorithm in 3 languages (so if there is a problem with the algorithm or if you think the random generators used in the codes doesn't produce uniform distribution, please point it out)

Maybe, Third question: If the algorithm does not approximate the said probability, please explain what does it approximate and why does it give 0.63

Algorithm:

test:
with a random number NUM from [0, 999]
see if R=NUM at least once in 1000 times

try:
run the test 10000 times.
how many times did the test come true ? divide by 10000.

The codes:

python:

import random
import math

def r():
    return math.floor(random.random()*1000)
def test():
    num = r()
    for i in range(1000):
        if r()==num:
        return True
    return False

q = 0
for i in range(10000):
    if test():
        q += 1
 
print(q / 10000)

js:

let r = () => Math.floor(Math.random()*1000)
let test = () => {
    let num = r()
    for(let i = 0; i < 1000; i ++)
        if(num===r()) return true
    return false
}

let q = 0
for(let i = 0; i < 10000; i ++)
    if(test()) q++

console.log(q / 10000)

cpp:

#include <iostream>
#include <stdlib.h>
using namespace std;

int r() {
    return rand() % 1000;
}

bool test() {
    int num = r();
    for(int i = 0; i < 1000; i ++) {
        if(r()==num) {
            return true;
        }
    }
    return false;
}

int main() {
    srand(1267);

    int q = 0;
    for(int i = 0; i < 100000; i ++) {
        if(test()) {
            q ++;
        }
    }
    cout << ((double)q/100000) << endl;
    
    return 0;
}
```

1 Answers1

2

If you take uniform random values $X_1,...,X_N \sim \text{IID U} \{ 1,...,N \}$ then for any $1 \leqslant k \leqslant N$ you have:

$$\begin{align} \mathbb{P}(\text{At least one } X_i = k) &= 1 - \mathbb{P}(X_1 \neq k,...,X_N \neq k) \\[6pt] &= 1 - \prod_{i=1}^N \mathbb{P}(X_i \neq k) \\[6pt] &= 1 - \prod_{i=1}^N \frac{N-1}{N} \\[6pt] &= 1 - \Big( \frac{N-1}{N} \Big)^N. \\[6pt] \end{align}$$

This gives you the exact probability of the event in question. Now, taking the limit as $N \rightarrow \infty$ gives the limiting probability:

$$\begin{align} \lim_{N \rightarrow \infty} \mathbb{P}(\text{At least one } X_i = k) &= 1 - \lim_{N \rightarrow \infty} \Big( \frac{N-1}{N} \Big)^N \\[8pt] &= 1 - \exp(-1) \\[10pt] &= 0.6321206. \\[6pt] \end{align}$$

Consequently, for large $N$ you have $\mathbb{P}(\text{At least one } X_i = k) \approx 0.6321206$, just as you are finding in your simulation. (The exact probability with $N = 1000$ is $0.6323046$.) The reason that this probability does not converge to one ---as per your intuition--- is that when you increase $N$ you are not only increasing the number of random values being generated, but you are also increasing the number of possible outcomes of those random variables.

Ben
  • 91,027
  • 3
  • 150
  • 376