I am running a simulation in R to illustrate that, with repeated testing and no fixed sample size, you are guaranteed to eventually reject the null hypothesis even when it is true. The procedure is:
- Start with two draws from a $N(0,1)$ distribution
- Run a t-test on the data set with the null hypothesis being $\mu = 0$
- If the p-value is less than 0.05, return the number of samples in the data
- Otherwise, add another draw from $N(0,1)$ to the data set and repeat.
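
In other words, with $X_1, X_2, \ldots \overset{\text{iid}}{\sim} N(0,1)$ and $p_n$ denoting the two-sided one-sample t-test p-value computed from the first $n$ draws, the quantity I record is the stopping time

$$N = \min\{\, n \ge 2 : p_n < 0.05 \,\}.$$
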
Here's the code.
    generate <- function() {
      data <- rnorm(n = 1, mean = 0, sd = 1)                  # first draw from N(0, 1)
      p <- 1
      while (p > 0.05) {
        data <- append(data, rnorm(n = 1, mean = 0, sd = 1))  # add another draw
        p <- t.test(data)$p.value                             # test H0: mu = 0 on the current data
      }
      return(length(data))                                    # number of samples at first rejection
    }
Sometimes I get a small number (between 10 and 1,000), sometimes a larger one (~50,000), and other times the function seems to loop forever.
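
To get an empirical look at the distribution without the risk of an infinite loop, I can run a capped variant of the function; the sketch below is just for illustration, and the name `generate_capped`, the cutoff `maxn`, and the number of replications are arbitrary choices of mine.

    # Capped variant: maxn is an arbitrary cutoff so the loop cannot run forever.
    generate_capped <- function(maxn = 1e5) {
      data <- rnorm(2)                      # start with two draws from N(0, 1)
      p <- t.test(data)$p.value             # test H0: mu = 0
      while (p > 0.05 && length(data) < maxn) {
        data <- c(data, rnorm(1))           # add one more draw
        p <- t.test(data)$p.value           # re-test on the enlarged data set
      }
      length(data)                          # stopping sample size (maxn if never rejected)
    }

    # Empirical distribution of the stopping time (slow for large maxn or many replications)
    stops <- replicate(200, generate_capped())
    hist(stops, breaks = 50)
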
Is there an analytic way to understand this distribution of the stopping sample size?