I want to find out how many flips I need to flip a coin to reliably know that it is an unfair coin.
The issue is that as the coin becomes closer to 50/50, the more false-negatives you will have if you don't take dramatically more data.
I wrote some python code and numerically found out how many flips are needed to be able to confirm a weighted coin with weight w. As w gets closer to .5, exponentially more flips are needed to have a low false-negative rate, as shown in this figure I made in python:
The x-axis represents the weight of the coin, while the y-axis represents the number of flips to get a false-negative rate of 5% (while also having a 5% false-positive rate).
Is there a way of getting this answer analytically? I'm particularly interested in what approximations can be done to know what will occur as you get closer to .5.
Here's my python code attempt that was used to generate the figure. I have added comments in the code to try to explain the logic:
from scipy import stats
import numpy as np
import matplotlib.pyplot as plt
# The goal of this script is to figure out how many flips are required to
# have a low false-negative rate when testing if a coin is fair for a specific
# p-value. I use this script to understand the relation between
# the unfairness of the coin and the amount of flips needed to get a low false-negative
# rate.
# Essentially we calculate the false-negative rate for some starting number of flips,
# and if this rate is lower than our desired threshold, then we add more number of flips,
# we continue this in a loop until we are under the false-negative rate (or until we run
# out of iterations)
# Finally, we run this all inside the function find_numberofFlips_and_false_negatives_vs_fairness
# this function has an input of the coins fairness, and returns a number of flips as an output
# we then use this function to collect a set of points for the coins fairness vs numberofFlips
# then we plot this
def flipAndDoPTest(numberOfFlips, weight, guessWeight):
"""
Simulate multiple coinflips and perform a binomial test,
returning a pvalue.
Parameters
----------
numberOfFlips : int
The number of flips that will be simulated.
weight : int
The weight of the coin that is flipped
guessWeight : float
The hypothesized probability of success, i.e. the expected
proportion of successes. We will typically use .5 for this.
Returns
-------
result :
pvalue : float
The p-value of the hypothesis test.
"""
flippedCoinSimulation = np.random.binomial(1, weight, numberOfFlips)
numberOfHeads = np.sum(flippedCoinSimulation==1)
numberOfTails = np.sum(flippedCoinSimulation==0)
pvalue = stats.binom_test(numberOfHeads, numberOfFlips, guessWeight)
return pvalue
def accurate_pvalue_interval(numberofFlips, desiredConfidenceLevel):
"""
Creates an new confidence interval which is closest to the desiredConfidenceLevel.
Because of the discrete nature of coin flips, it's not possible to get any arbitrary
confidence level. This code finds the closest confidence level that has a pvalue lower
than the desired confidence level.
Parameters
----------
numberofFlips : int
The number of flips that will be done for the
relevant confidence level.
desiredConfidenceLevel : float
the desiredConfidenceLevel that we will find the nearest
confidence level for
Returns
-------
result :
pValUnderThreshold : float
the nearest valid pvalue that is smaller than the desiredConfidenceLevel
"""
numberofIntervals = np.int(np.round(numberofFlips/2));
intervalIndex = np.arange(numberofIntervals);
intervalList = 2*(stats.binom.cdf(intervalIndex,numberofFlips,.5));
pValUnderThreshold = np.max(intervalList[intervalList <= desiredConfidenceLevel])
return pValUnderThreshold
def false_negative_calculator(numberofFlips, unfairCoinWeight, desiredConfidenceLevel):
"""
loop through each coin and do a simulated p-test to see if for a set number of coinflips
if the given coin is verified to be unfair in a confidence window.
if the coin is a fair coin, and pvalue<=closestConfidenceLevel (meaning it passes the binomial test and
the coin is unfair in a certain confidence window), then we add 1 to the false-positive number
if the coin is an unfair coin, yet is not identified as one, then we add1 to the false-negative numbers
"""
guessWeight = .5
falsePositives = 0
falseNegatives = 0
numberofTrials = 1000
#for our p-test, we want to make sure to pick an interval that fits the discretness of
#the problem. Therefore we input a desired confidence interval, and the function finds
#the nearest confidence interval that can be used.
closestConfidenceLevel = accurate_pvalue_interval(numberofFlips, desiredConfidenceLevel)
#see how many false negatives are obtained in a number of trials.
for ithTrial in range(numberofTrials):
pvalue = flipAndDoPTest(numberofFlips, unfairCoinWeight, guessWeight)
if pvalue>closestConfidenceLevel:
falseNegatives += 1
falseNegativeRate = falseNegatives/numberofTrials
return falseNegativeRate
def find_numberofFlips_and_false_negatives_vs_fairness(unfairCoinWeight):
numIterations = 1000
#maximum number of iterations used in the loop
numberofFlips = 10
#initial starting number of flips used in the first iteration of loop
flipNumberIncrement = 50
#how much the flip number increases by each iteration in loop
numberofStandardDeviationSeparation = 4
#decides the amount of standard deviations the unfair coin is from the fair coin that we finding the
#number of flips for
falseNegativeThreshold = .05
desiredConfidenceLevel = .05
#Loop explained in the function deviation_separation_calculator()
for stepi in range(numIterations):
falseNegativeRate = false_negative_calculator(numberofFlips, unfairCoinWeight, desiredConfidenceLevel)
if falseNegativeRate <= falseNegativeThreshold:
print("Completed in", stepi, "total increments and requires", numberofFlips, "number of flips", "with a false negative rate of:", round(falseNegativeRate*100, 4), "%")
break
else:
numberofFlips += flipNumberIncrement
if falseNegativeRate > falseNegativeThreshold:
print("Could not find solution in the given number of iterations:", numIterations)
return numberofFlips
weightList = np.array([.9, .8, .7, .6, .55, .54, .53, .525, .520, .515, .510])
#list of weights of the unfair coin; as this gets closer to .5 the more numberofFlips will be needed for
#our condition to be satisfied
numofFlipsList = np.arange(len(weightList))
for i in range(len(weightList)):
numofFlipsList[i] = find_numberofFlips_and_false_negatives_vs_fairness(weightList[i])
print("For fairness weight of:", weightList[i], "number of flips required is",numofFlipsList[i])
plt.plot(weightList, numofFlipsList)