How many times must I roll a die to confidently assess its fairness?

Question

(Apologies in advance for use of lay language rather than statistical language.)

If I want to measure the odds of rolling each side of a specific physical six-sided die to within about +/- 2% with a reasonable confidence of certainty, how many sample die rolls would be needed?

That is, how many times would I need to roll a die, counting each result, to be 98% sure that the chance of rolling each side is within 14.7% - 18.7%? (Or some similar criterion where one would be about 98% sure the die is fair to within 2%.)

(This is a real-world concern for simulation games using dice and wanting to be sure certain dice designs are acceptably close to 1/6 chance of rolling each number. There are claims that many common dice designs have been measured rolling 29% 1's by rolling several such dice 1000 times each.)

Note that if you are just interested in checking whether 1's are rolled a fair amount of the time, this simplifies the question a lot. — Dennis Jaheruddin, Oct 09 '18 at 11:10
It is important to note that a "confidence interval" does not give you a "percentile likelihood of being correct". I suspect you are using "98% sure" in its very reasonable everyday sense, but be aware that whenever someone mentions a "confidence interval", that is not at all the same as a 98% likelihood: https://link.springer.com/article/10.3758%2Fs13423-013-0572-3 — BrianH, Oct 09 '18 at 13:55
@DennisJaheruddin Thanks Dennis, but I am actually interested in the chance of each side coming up. — Dronz, Oct 09 '18 at 16:03
@BrianH Thank you! I did not just mean the colloquial expression, but am looking to quantify the certainty implied by the test. It seems to me that, in the same way that it makes sense to say I expect to roll some die result a calculable percentage of the time, there would be a similar (but more complex) calculation for how likely I am to roll results within a certain margin of error if I roll n times, which is what I think Xiaomi's answer (and follow-up comment) is saying. Yes? — Dronz, Oct 09 '18 at 16:09
This is a lot trickier than finding the confidence interval for a binomial, since you'd want to keep all probabilities in check. Have a look at [Hsiuying Wang's paper](https://www.sciencedirect.com/science/article/pii/S0047259X07000784) on simultaneous confidence intervals for multinomial distributions (*Journal of Multivariate Analysis* 2008, 99, 5, 896-911). You can find some code in [this blog post](https://blogs.sas.com/content/iml/2017/02/15/confidence-intervals-multinomial-proportions.html), which also gives a quick summary on some of the work that's been done on this. — idnavid, Oct 09 '18 at 00:24
@Dronz I suspect that you are intuitively looking for a Bayesian interpretation, of the form "what is the probability that this specific dice is fair?" A full and proper answer, following that thought, is then "so how many times would I need to roll the dice to get a 98% probability?" - the answer to which is "it depends on how biased the dice is, in what way it is biased, how overall likely you think it is that you'd be dealing with a biased dice, etc", so no easy answer. After all, if you roll a dice 50 times and only ever get a 1 each time, would you be pretty darn sure something was amiss? — BrianH, Oct 10 '18 at 19:43
@Dronz To be fair, this is one of those things that you really think would be more straight-forward than it turns out to actually be. Devilishly tricky, in fact. Here's some key related questions elsewhere to help give you an idea of how there is no incredibly straight-forward answer: Frequentist https://math.stackexchange.com/questions/1578932/fair-die-or-not-from-3d-printer/1580703#1580703 Bayesian https://math.stackexchange.com/questions/1584833/bayesian-approach-is-a-die-from-a-3-d-printer-fair and fun: https://rpg.stackexchange.com/questions/70802/how-can-i-test-whether-a-die-is-fair — BrianH, Oct 10 '18 at 19:46

score 19 · Accepted Answer · edited Oct 09 '18 at 19:51

TL;DR: if $p = 1/6$ and you want to know how large $n$ needs to be for you to be 98% sure the die is fair (to within 2%), you need $n \geq 766$.

Let $n$ be the number of rolls and $X$ the number of rolls that land on some specified side. Then $X$ follows a $\text{Binomial}(n,p)$ distribution, where $p$ is the probability of getting that specified side.

By the central limit theorem, we know that

$$\sqrt{n} (X/n - p) \to N(0,p(1-p))$$

since $X/n$ is the sample mean of $n$ Bernoulli$(p)$ random variables. Hence, for large $n$, confidence intervals for $p$ can be constructed as

$$\frac{X}{n} \pm Z \sqrt{\frac{p(1-p)}{n}}$$

Since $p$ is unknown, we can replace it with the sample average $\hat{p} = X/n$, and by various convergence theorems, we know the resulting confidence interval will be asymptotically valid. So we get confidence intervals of the form

$$\hat{p} \pm Z \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$

with $\hat{p} = X/n$. I'm going to assume you know what $Z$-scores are. For example, if you want a 95% confidence interval, you take $Z=1.96$. So for a given confidence level $\alpha$, with $Z_\alpha$ the corresponding two-sided critical value, we have

$$\hat{p} \pm Z_\alpha \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$
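As an illustration of the interval above, here is a minimal Python sketch. It assumes $Z_\alpha$ is the two-sided normal critical value, and the function name `wald_interval` and the roll counts are hypothetical, chosen only for the example.

```python
from math import sqrt
from statistics import NormalDist


def wald_interval(successes: int, n: int, confidence: float = 0.98) -> tuple[float, float]:
    """Normal-approximation interval for one binomial proportion."""
    p_hat = successes / n
    # Two-sided critical value: (1 - confidence) / 2 is left in each tail.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


# Hypothetical data: 180 ones observed in 1000 rolls of one die.
low, high = wald_interval(180, 1000)
print(f"98% interval for P(roll a 1): [{low:.3f}, {high:.3f}]")
```

The interval covers a single face only; it says nothing about the other five faces at the same time.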

Now let's say you want this confidence interval to have length at most $C_\alpha$, and want to know how big a sample you need to achieve that. This is equivalent to asking what $n_\alpha$ satisfies

$$Z_\alpha \sqrt{\frac{\hat{p}(1-\hat{p})}{n_\alpha}} \leq \frac{C_\alpha}{2}$$

Solving for $n_\alpha$ gives

$$n_\alpha \geq \left(\frac{2 Z_\alpha}{C_\alpha}\right)^2 \hat{p}(1-\hat{p})$$

So plug in your values for $Z_\alpha$, $C_\alpha$, and estimated $\hat{p}$ to obtain an estimate for $n_\alpha$. Note that since $p$ is unknown this is only an estimate, but asymptotically (as $n$ gets larger) it should be accurate.
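To make the plug-in step concrete, here is a small Python sketch of the bound on $n_\alpha$. It is only one reading of the inputs: it assumes $Z_\alpha$ is the two-sided critical value, that "within 2%" means a total interval length of $C_\alpha = 0.04$, and that $\hat{p}$ is taken as $1/6$. The function name `required_rolls` is made up for this example.

```python
from math import ceil
from statistics import NormalDist


def required_rolls(p: float, total_width: float, confidence: float) -> int:
    """Smallest integer n with n >= (2 * Z / C)^2 * p * (1 - p)."""
    # Assumption: Z is the two-sided normal critical value.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil((2 * z / total_width) ** 2 * p * (1 - p))


# Assumed inputs: p = 1/6, interval of +/- 0.02 (total width 0.04), 98% confidence.
n = required_rolls(p=1 / 6, total_width=0.04, confidence=0.98)
print(n)
```

Under these particular assumptions the sketch returns a value near 1,880, which is larger than the 766 quoted in this thread; the answer does not state which $Z_\alpha$ and $C_\alpha$ produced that figure, so check the inputs before relying on either number.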

answered Oct 09 '18 at 00:28 by Xiaomi · edited Oct 09 '18 at 19:51 by Cullub

Thanks. As I have not done college-type math in decades, could I trouble you to plug in the numbers and actually give me a ballpark number of times I'd need to roll a die, as an integer? – Dronz Oct 09 '18 at 00:40
If $p = 1/6$ and you want to know how large $n$ needs to be for you to be 98% sure the die is fair to within 2%, you need $n \geq 766$. Ignore my last comment; I used an incorrect $C_\alpha$. – Xiaomi Oct 09 '18 at 01:40
Super, thank you! (So the 1000-roll tests were a reasonable number to use, particularly when done on several dice each. Interesting.) – Dronz Oct 09 '18 at 03:37
It might be more interesting to look at the multinomial distribution, since now we test for each side separately. This does not take into account all the information we have on the problem. For an intuitive explanation, look at https://www.stat.berkeley.edu/~stark/SticiGui/Text/chiSquare.htm – Jan Oct 09 '18 at 15:21
I agree with @Jan: This answer does not address the question. Moreover, it cannot easily be adapted to construct an answer by applying it separately to all six faces, because the six tests are interdependent. – whuber Oct 09 '18 at 18:56
This is a nice answer, but I fully agree with @Jan, whuber. This question deserves an answer based on chi-square statistic and multinomial distribution. – Łukasz Grad Oct 09 '18 at 19:49
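For readers wondering what the chi-square approach mentioned in these comments looks like, here is a rough Python sketch of Pearson's goodness-of-fit statistic for a six-sided die. Note that it is a hypothesis test of fairness, not a sample-size calculation, and it swaps the 98% target for the conventional 5% level. The function name, the roll counts, and the choice of level are assumptions made for illustration; 11.070 is the tabulated upper 5% point of the chi-square distribution with 5 degrees of freedom.

```python
def chi_square_statistic(counts: list[int]) -> float:
    """Pearson statistic against the hypothesis that all faces are equally likely."""
    n = sum(counts)
    expected = n / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)


# Tabulated upper 5% critical value for chi-square with 5 degrees of freedom
# (six faces minus one). The 5% level is an assumption for this example.
CRITICAL_5PCT_DF5 = 11.070

# Hypothetical counts of faces 1..6 from 1000 rolls, with 29% ones.
counts = [290, 150, 140, 140, 140, 140]
stat = chi_square_statistic(counts)
print(f"statistic = {stat:.1f}, reject fairness: {stat > CRITICAL_5PCT_DF5}")
```

A single statistic uses all six counts at once, which is the efficiency argument made in the comments.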
@Jan So if I follow, you're saying this answer from Xiaomi gives a formula for the number of rolls to test one of the six sides' chances, but since I want to be 98% sure all of the six sides roll within 2% of 16.67%, that 2% chance the test is misleading would apply to each side. So the odds that one of the sides is more than 2% off is 2%, for each side, which means six 2% chances of a misleading result, so you have to roll more times to actually get the chance of error on _any_ side to be below 2%, yes? – Dronz Oct 10 '18 at 22:17
@whuber "this answer does not address the question", did you even read the link? It is specifically about whether $1$'s are more common or not. By isolating the problem to the fairness of the $1$ side, the problem is simplified significantly, and absolutely answers that aspect of the question. – Xiaomi Oct 11 '18 at 01:53
@Xiaomi The link in my question does suggest the 1 is usually the culprit, but I would be interested in testing all of the sides. I think though that even my math can work out the adjustment, using the logic I posted in my comment above. i.e. I think it would be equivalent to adding five more independent chances that there's an error in the accuracy estimate. – Dronz Oct 11 '18 at 06:10
@Dronz What I'm saying is that the approach taken here is not very efficient. This procedure has to be repeated for all sides and does not use the interdependence of the sides of the dice. This means that we could test the fairness of the dice with fewer rolls when using the multinomial distribution. We could check for all the sides at the same time without having to do each side separately. – Jan Oct 11 '18 at 10:10
@Jan In the context of dice rolls... I'm not sure the extra effort is worth it. If I understand it, his point was more to check whether the existing experiments had an adequate sample size, and this answer gives him an upper bound. If the sample size here is smaller than that used by them, then so would be the "efficient" sample size required under a multinomial framework. Your concern only seems relevant if the sample size they used was lower than the one in this answer, since this upper bound may not be tight, or if the experiment was more complicated than a simple dice roll. – Xiaomi Oct 11 '18 at 10:14
You are providing a procedure that checks how many dice rolls would be necessary and it will definitely work. However the question in my eyes was: "If I want to measure the odds of rolling each side of a specific physical six-sided die to within about +/- 2% with a reasonable confidence of certainty, how many sample die rolls would be needed?" I would answer this with a procedure that provides the lowest bound. Your procedure works but we can do it with less if we apply a more efficient method. – Jan Oct 11 '18 at 10:36
I did not read the link because it should not be necessary: I read the *question* and took it at face value. I am asking that you do that too. Despite its popularity and its acceptance, I maintain that this answer is incorrect. You have to simultaneously check the equidistribution of all six faces, not just the distribution of a single face at a time. – whuber Oct 11 '18 at 14:28
In my actual case, I have rolled some specific dice hundreds of times and recorded the numbers rolled. As I understand it, the answer here gives the number of rolls needed for a 98% chance that my results are within 2% of the actual likelihood that the die will roll each side. Since I am interested in every side, what @Jan and others are pointing out in the comments is that I have 98% confidence for each side individually, but each side then has a 2% chance of being off by more than 2%, so the total chance that at least one side is off by more than 2% is more than 2%, and I need more rolls to be as certain for all 6 sides. – Dronz Oct 11 '18 at 16:57
Dronz, you nailed it: that's exactly the issue. For a die of $d=3$ or more sides, a decent solution is to replace your target of $\alpha=98\%$ confidence by $100(1 - (1-\alpha)/d)\%$ when computing the sample size. With $d=6$ that would be $99.67\%$. The sample size would have to be about 59% larger than what you might expect without this correction. Because that's a sizable difference, I have insisted in earlier comments that this answer is incorrect. – whuber Oct 12 '18 at 23:08
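The correction described in this comment can be sketched in Python as follows. The sketch assumes a two-sided normal critical value, a half-width of 0.02 around $p = 1/d$, and the same normal-approximation sample-size bound as in the answer; the function name `rolls_needed` is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def rolls_needed(confidence: float, sides: int = 6, half_width: float = 0.02,
                 correct_for_all_sides: bool = True) -> int:
    """Normal-approximation sample size for one face, optionally Bonferroni-adjusted."""
    p = 1 / sides
    error_rate = 1 - confidence
    if correct_for_all_sides:
        # Split the allowed error rate evenly across the faces.
        error_rate /= sides
    z = NormalDist().inv_cdf(1 - error_rate / 2)
    return ceil((z / half_width) ** 2 * p * (1 - p))


plain = rolls_needed(0.98, correct_for_all_sides=False)
corrected = rolls_needed(0.98)
print(f"per-face confidence after correction: {1 - 0.02 / 6:.4%}")
print(f"rolls without correction: {plain}, with correction: {corrected}")
print(f"ratio: {corrected / plain:.2f}")
```

The ratio of the two sample sizes depends only on the two critical values, so it should come out close to the 59% increase mentioned above whatever half-width is assumed.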
