Certain interesting probability functions can arise from permutations. For example, permutations that are sorted or permutations that form a cycle.
Inspired by the so-called von Neumann schema given in a paper called "On Buffon machines and numbers" by Flajolet and colleagues (2010), we can describe the following algorithm. To describe it, the following definition is needed:
- A permutation class is a rule that describes how a sequence of numbers must be ordered. The ordering of the numbers is called a permutation. Two examples of permutation classes cover permutations sorted in descending order, and permutations whose highest number appears first. When checking whether a sequence follows a permutation class, only less-than and greater-than comparisons between two numbers are allowed.
The algorithm produces a discrete random variate based on a permutation class. Let $D$ and $E$ be absolutely continuous distributions.
- Create an empty list.
- If the list is empty, generate a random variate distributed as $D$. Otherwise, generate a random variate distributed as $E$. Either way, append the random variate to the end of the list.
- Let $n$ be the number of items in the list minus 1. If the items in the list do not form a permutation that meets the permutation class's requirements, return $n$. Otherwise, go to step 2.
If $D$ and $E$ are both uniform(0, 1), this algorithm returns the number n with the following probability:
$$\eqalign{ G(n)&= (1-\frac{V(n+1)}{V(n)*(n+1)}) * (1-\sum_{j=0}^{n-1} G(j)) \\ &= \frac{V(n)*(n+1)-V(n+1)}{V(0)*(n+1)!}, }$$
Where $V(n) \in (0, n!]$ is the number of permutations of size n that meet the permutation class's requirements. $V(n)$ can be a sequence associated with an exponential generating function (EGF) for the kind of permutation involved in the algorithm. (Examples of permutation classes include permutations whose numbers are sorted in descending order, or permutations whose first number is highest.) For example, if we use the class of permutations sorted in descending order, the EGF is $\exp(\lambda)$, so that $V(n)$ = 1.
For this algorithm, if $D$ and $E$ are both uniform(0, 1), the probability that the generated n—
- Is odd is $1-1/EGF(1)$, or
- is even is $1 / EGF(1)$, or
- is less than $k$ is $\frac{V(0)-V(k)/k!}{V(0)}$.
Thus, for example, if we allow sorted permutations, the algorithm returns an odd number with probability that is exactly $1-\exp(-1)$.
Depending on the permutation class, the distributions $D$ and $E$, and which values of $n$ we care about, different probabilities and different distributions of numbers will arise. For example:
- If the class is sorted permutations, both $D$ and $E$ are the uniform distribution, and given that the return value $n$ is odd, it is known since von Neumann's 1951 algorithm that that number has a truncated exponential distribution.
- If the class is sorted permutations, both $D$ and $E$ are arbitrary distributions, and given that the return value $n$ is odd, then Forsythe (1972) and Monahan (1979) have characterized the distribution function of the sequence's first number.
See the tables in my section "Probabilities Arising from Certain Permutations" for further examples.
For these reasons, it seems to me that this algorithm can open the door to new and exact samplers for continuous and discrete distributions, including new and exact ways to sample certain irrational probabilities. (And I list many of them in "Bernoulli Factory Algorithms".) And this is why I ask the following questions:
For a given permutation class, a given distribution $D$, and a given distribution $E$—
- what is the probability that the algorithm will return a particular $n$?
- what is the probability that the algorithm will return an $n$ that belongs to a particular class of values (such as odd numbers or even numbers)?
- what is the probability that the first number in the sequence is less than $x$ given that the algorithm returns $n$ (or one of a particular class of values of $n$)?
- what is the probability that the last number in the sequence is less than $x$ given that the algorithm returns $n$ (or one of a particular class of values of $n$)?
Note that the third part of the question is equivalent to: What is the CDF of the first number's distribution given that $n$ is returned? Similarly for the fourth part of the question.
REFERENCES:
- Forsythe, G.E., "Von Neumann's Comparison Method for Random Sampling from the Normal and Other Distributions", Mathematics of Computation 26(120), October 1972.
- Monahan, J. "Extensions of von Neumann’s method for generating random variables." Mathematics of Computation 33 (1979): 1065-1069.