The explanation on the referenced page is:

> Under the null hypothesis the probability $\Pr(P \le k / n_\text{sim})$ is exactly $k / n_\text{sim}$ when both the randomness in the data and the randomness in the simulation are taken into account.
To understand this, we must look at the code, of which the key lines (considerably abbreviated) are:

```r
fred <- function(x) {ks.test(...)$statistic}    # Apply a statistical test to an array
d.hat <- fred(x)                                # Apply the test to the data
d.star <- apply(matrix(rnorm(n * nsim), n, nsim),
                2, fred)                        # Apply the test to nsim simulated datasets
pval <- (sum(d.star > d.hat) + 1) / (nsim + 1)  # Estimate a simulation p-value
```
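For concreteness, here is a self-contained version one might actually run. The specific test (a Kolmogorov-Smirnov test against a standard normal null), the sample size, and the number of simulations are assumptions for illustration only; the referenced code elides the arguments of `ks.test`:

```r
## Runnable sketch of the quoted procedure. The KS test against N(0,1),
## n, and nsim are illustrative assumptions, not the original's choices.
set.seed(17)
n <- 50; nsim <- 999
x <- rnorm(n)                                     # "Data" drawn here from the null itself
fred <- function(x) ks.test(x, "pnorm")$statistic # Test statistic for one dataset
d.hat <- fred(x)                                  # Statistic for the data
d.star <- apply(matrix(rnorm(n * nsim), n, nsim), 2, fred)
pval <- (sum(d.star > d.hat) + 1) / (nsim + 1)
pval                                              # Takes values in {1/(nsim+1), ..., 1}
```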
The salient problem is that the code does not match the quotation. How can we reconcile them? One attempt begins with the last half of the quotation. We might interpret the procedure as comprising the following steps:
1. Collect independently and identically distributed data $X_1, X_2, \ldots, X_n$ according to some probability law $G$. Apply a test procedure $t$ (implemented in the code as `fred`) to produce the number $T_0 = t(X_1, \ldots, X_n)$.

2. Generate via computer $N = n_\text{sim}$ comparable datasets, each of size $n$, according to a null hypothesis with probability law $F$. Apply $t$ to each such dataset to produce $N$ numbers $T_1, T_2, \ldots, T_N$.

3. Compute $$P = \left(\sum_{i=1}^N I(T_i \gt T_0) + 1\right) / (N+1).$$

("$I$" is the indicator function implemented by the vector-valued comparison `d.star > d.hat` in the code.) The right-hand side is understood to be random by virtue of the simultaneous randomness of $T_0$ (the actual test statistic) and the randomness of the $T_i$ (the simulated test statistics).
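The argument below turns on the fact that $P$ depends on $T_0$ only through its rank among all $N+1$ statistics. Here is a quick numerical check of that fact (not from the original post; the value of $N$ and the seed are arbitrary):

```r
## Check that P is a function of the rank of T0 among all N + 1 values.
## N and the seed are arbitrary illustrative choices.
set.seed(1)
N <- 19
T0 <- runif(1)                # Stand-in for the actual test statistic
Ti <- runif(N)                # Stand-ins for the N simulated statistics
P  <- (sum(Ti > T0) + 1) / (N + 1)
r  <- rank(c(T0, Ti))[1]      # Ascending rank of T0 among all N + 1 values
P == (N + 2 - r) / (N + 1)    # TRUE: exactly N + 1 - r of the Ti exceed T0
```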
To say that the data conform to the null hypothesis is to assert that $F = G$. Pick a test size $\alpha$, $0 \lt \alpha \lt 1$. Multiplying both sides of the inequality $P \le \alpha$ by $N+1$ and subtracting $1$ shows that the chance that $P \le \alpha$ is the chance that no more than $(N+1)\alpha - 1$ of the $T_i$ exceed $T_0$. This says merely that $T_0$ lies within the top $(N+1)\alpha$ of the sorted set of all $N+1$ test statistics. Since (by construction) $T_0$ is independent of all the $T_i$, when $F$ is a continuous distribution all $N+1$ statistics are exchangeable and almost surely distinct, so $T_0$ is equally likely to occupy any of the $N+1$ positions in the sorted order. The chance it lands in the top $(N+1)\alpha$ positions is therefore the fraction of the total given by the integer part $\lfloor (N+1)\alpha \rfloor$; that is, $$\Pr(P\le \alpha)=\frac{\lfloor(N+1)\alpha\rfloor}{N+1} \approx \alpha,$$ with exact equality precisely when $(N+1)\alpha$ is a whole number $k$; that is, when $\alpha = k/(N+1)$.
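A small simulation corroborates this formula. Because only the ranks matter when $F$ is continuous, the statistics can be generated as uniform variates instead of re-running the test; the choices of $N$, $\alpha$, and the replicate count below are arbitrary:

```r
## Monte Carlo check of Pr(P <= alpha) = floor((N+1)*alpha) / (N+1).
## N, alpha, and the replicate count are arbitrary illustrative choices.
set.seed(17)
N <- 19; n.rep <- 1e5
P <- replicate(n.rep, {
  T0 <- runif(1)              # Statistic for the "data" under the null
  Ti <- runif(N)              # The N simulated statistics
  (sum(Ti > T0) + 1) / (N + 1)
})
alpha <- 0.05                 # (N + 1) * alpha = 1, a whole number
c(empirical   = mean(P <= alpha),
  theoretical = floor((N + 1) * alpha) / (N + 1))
```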
This certainly is one of the things we want to be true of any quantity that deserves to be called a "p-value": under the null hypothesis it should have a uniform distribution on $[0,1]$. Provided $N+1$ is fairly large, so that any $\alpha$ is close to some fraction of the form $k/(N+1) = k/(n_\text{sim}+1)$, this $P$ will have close to a uniform distribution. (To learn about additional conditions required of a p-value, please read the dialog I posted on the subject of p-values.)
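To see how close, one can compare the nominal size with the attained size given by the formula; the values of $N$ and $\alpha$ below are illustrative assumptions:

```r
## Attained size floor((N+1)*alpha)/(N+1) versus the nominal alpha.
## N = 99 and the alpha values are illustrative assumptions.
N <- 99
alpha <- c(0.05, 0.047)       # 0.05 = 5/(N+1) is attainable exactly; 0.047 is not
rbind(nominal  = alpha,
      attained = floor((N + 1) * alpha) / (N + 1))
```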
Evidently the quotation should use "$n_\text{sim}+1$" instead of "$n_\text{sim}$" wherever it appears.