13

I am looking for the distribution of a random variable $Z$ defined as

$$Z = \sqrt{X_1+\sqrt{X_2+\sqrt{X_3+\cdots}}} .$$

Here the $X_k$'s are i.i.d. and have same distribution as $X$.

1. Update

I am looking to find a simple distribution for $X_k$, that results in a simple distribution for the nested square root $Z$. Thus my idea to investigate distributions stable under some particular transformations. But this may not be the easiest way.

I tried a Bernoulli (with parameter $\frac{1}{2}$) for $X_k$, but this leads to some very difficult, nasty stuff, and a distribution on $[1, \frac{1+\sqrt{5}}{2}]$ full of gaps - some really big - for $Z$. So far the most promising result is the following.

Use a discrete distribution for $X_k$, taking on three possible values $0, 1, 2$ with the probabilities

  • $P(X_k = 0) = p_1$
  • $P(X_k = 1) = p_2$
  • $P(X_k = 2) = p_3 = 1-p_1-p_2$.

Now the resulting domain for $Z$'s distribution is $[1, 2]$, and the gaps are eliminated. The resulting distribution is still very wild, unless $p_1, p_2, p_3$ are carefully chosen. Consider

  • $p_1=\sqrt{5\sqrt{2}-1}-2$,
  • $p_2=\sqrt{5\sqrt{3}-1}-\sqrt{5\sqrt{2}-1}$,
  • $p_3=3-\sqrt{5\sqrt{3}-1}$.

I was naively thinking that this would lead to $Z$ being uniform on $[1, 2]$, based on the table featured in my article Number Representation Systems Explained in One Picture (published here, see column labeled "nested square root", with row labeled "digits distribution".) But $Z$ does not appear to be uniform, though it does appear to be well behaved: it looks like $F_Z(z)$ is a polynomial of degree 2 if $z\in [1, 2]$. Then I modified a bit the values of $p_1, p_2, p_3$, removing 0.02 to $p_1$ and adding 0.02 to $p_3$. The result for $Z$ looks much closer to uniform on $[1, 2]$ this time.

Anyway, that's where I am now. My re-formulated question is: with appropriate values for $p_1, p_2, p_3$ (and what would these values be?) can we have a simple distribution for $Z$? (uniform or polynomial on $[1,2]$)

Note: With the particular discrete distribution in question, the support domain for $Z$ is $[1, 2]$. Sure, if all $X_k$ are zero, then $Z=0$ but that happens with probability zero. If all but one of the $X_k$ is zero, then $Z\geq 1$.

2. Second update

Regarding my statement I was naively thinking that this would lead to $Z$ being uniform on $[1, 2]$. I think the reason that it doesn't is because for this to happen, the $X_k$'s would need to have the right auto-correlation structure required to form a normal number in the numeration system based on infinite nested radicals. In my experiment, I used i.i.d. $X_k$'s. But for normal numbers (in that system) lag-1 auto-correlation between successive digits (the $X_k$'s being the digits) is close to zero, but not exactly zero. By contrast, in the binary numeration system, the digits $X_k$'s of normal numbers are not correlated, and thus if $X_k$ is Bernouilli of parameter $p=\frac{1}{2}$, then $Z = \sum_{k=0}^\infty X_k \cdot 2^{-k}$ is uniform on $[0, 1]$. But if $p\neq \frac{1}{2}$, then the distribution of $Z$ is pretty wild, see here.

3. Third update

Assume the $X_k$'s are i.i.d. with the discrete distribution mentioned earlier. Then the density $f$ associated with $Z$, if it exists, must satisfy:

  • $z \in ]1,\sqrt{2}[\Rightarrow f(z) = 2p_1 z f(z^2)$

  • $z \in ]\sqrt{2},\sqrt{3}[\Rightarrow f(z) = 2p_2 z f(z^2-1)$

  • $z \in ]\sqrt{3},2[\Rightarrow f(z) = 2p_3 z f(z^2-2)$

This excludes the possibility that $Z$'s distribution is as simple as a finite polynomial, regardless of $p_1, p_2, p_3$. Also, at $z=1, \sqrt{2}, \sqrt{3}$ or $2$, $f(z)$ may be zero, infinite, not exist or be discontinuous.

Finally, if $f(z)$ is properly defined (not zero or infinite) at $z=(1+\sqrt{5})/2$, then we have $p_2 = 1/(1+\sqrt{5})$: this is a direct result of the second equation in the above mathematical formula. Using the same equations with $z=\sqrt{2}$ and $z=\sqrt{3}$ yields $p_2/p_1=p_3/p_2$, if $f(1)$ and $f(2)$ are well defined. Combined with the value for $p_2$ and the fact that $p_1+p_2+p_3 =1$, we easily obtain interesting values: $p_1 = 1/2, p_2 = 1/(1+\sqrt{5}), p_3= (3-\sqrt{5})/4$. The following of this section is split into three cases.

Case 1:

If $p_1 = 1/2$ and $f(1)$ is well defined, one would assume that if $z \in ]1,\sqrt{2}[$ and the density is continuous, then $f(z) = f(1) / z$, because of the first formula resulting in

$$f(z) = f(\sqrt{z})/\sqrt{z} = f(z^{1/2^n})\cdot\Big(z^{\frac{1}{2}+\frac{1}{2^2}+\cdots +\frac{1}{2^n}}\Big)^{-1} \rightarrow \frac{f(1)}{z}.$$

Case 2:

The case $z\in ]\sqrt{2},\sqrt{3}[$ is quite interesting. Let's use $p_2 = 1/(1+\sqrt{5})$ and let $\phi = 2p_2$. Also, let us define $$R_1(z) =\sqrt{1+z}, R_2(z) =\sqrt{1+\sqrt{1+z}},R_3(z) =\sqrt{1+\sqrt{1+\sqrt{1+z}}}$$ and so on. Using the formula $f(z) = \phi\cdot\sqrt{1+z}\cdot f(\sqrt{1+z})$ iteratively, one gets $$f(z)=f(R_n(z))\cdot\phi^n\cdot\prod_{k=1}^n R_k(z).$$ The expression on the right-hand size converges as $n\rightarrow\infty$. Note that $R_n(z) \rightarrow \phi^{-1}$.

Note that if $z\in ]2^{1/4}, 3^{1/4}[$ then $f(z)$ can be computed either using case 1, or as follows: $f(z) = 2p_1 z f(z^2)$ and since $z^2 \in ]\sqrt{2}, \sqrt{3}[$ you can compute $f(z^2)$ using case 2. If the two different methods produce different results, the likely explanation is that $f(1)$ does not exist: $f$ oscillates infinity many times around $z=1$, making case 1 useless. This is something I have yet to explore.

Case 3:

Here $z\in ]\sqrt{3},2[$. I haven't checked it yet.

  • 1
    The lognormal distribution satisfies your first requirement (but not the second one). – Jarle Tufto Oct 18 '19 at 20:36
  • 1
    So as Bernoulli – gunes Oct 18 '19 at 20:37
  • I'll start with Bernoulli for $X_k$ and see what I get for $Z$. Should not be too hard I think, and quite exciting.Of course in that case, $0\leq Z \leq (1+\sqrt{5})/2$. – Vincent Granville Oct 18 '19 at 20:42
  • 4
    Generalized gamma also works for the first requirement and in some special cases, the second. – soakley Oct 18 '19 at 20:44
  • I am open to distributions with infinite variance too. Looking at the characteristic functions (what they must satisfy) in order for this to work nicely. – Vincent Granville Oct 18 '19 at 20:53
  • @Artem: Convergence in distribution. It's OK if the limiting distribution does not belong to the same family. I'd like to work with some "nice (easy) setting" just like Bayesian statisticians are interested in distributions that are stable under some transformations (prior ~ posterior.) – Vincent Granville Oct 18 '19 at 21:12
  • 4
    It depends what you call a family of distributions. If this family contains all probability distributions, this is always true. – Xi'an Oct 19 '19 at 12:42
  • Weibull distributions satisfy the title requirement but are not stable by addition. – Xi'an Oct 19 '19 at 12:52
  • 2
    You can create families of such distributions by closing any set of distributions under this operation. That reduces the question to characterizing those families; especially, to showing there are some "interesting" ones. As an example of where this can lead, one such family consists of all (necessarily discrete) distributions supported on the non-negative algebraic numbers whose degrees are powers of two. This, however, does not satisfy the more restrictive limiting condition imposed on $Z$ in the question. It does raise a question, though: are you looking for a finitely parameterized family? – whuber Oct 19 '19 at 13:30
  • @Whuber: I updated my question, and I am now focusing on a discrete distribution that seems to have potential. See the section "Update". – Vincent Granville Oct 19 '19 at 16:23
  • for sure support should be $R^+$ – quester Oct 19 '19 at 17:15
  • @Quester: For the support, the max value is when $X_k=2$ for all $k$, then $Z=2$. The min value is when $X_k =0$ for all but one $k$ and the only non-zero $X_k$ is equal to one. Then $Z=1$. The value $Z=0$ is a singularity. – Vincent Granville Oct 19 '19 at 17:19
  • Could you explain how "the resulting domain for $Z$'s distribution is $[1,2]$"? – whuber Oct 19 '19 at 19:01
  • @Whuber: I sampled 10,000 values of $Z$, each based on 2,000 i.i.d. $X_k$ with the discrete distribution in question. You can access it, as well as the Perl code, at http://datashaping.com/nsr.txt. Min value is 1.000028749, max is 1.999912237, and there is a continuum of values between these two extremes. (the value $Z=0$ is a singularity happening with probability zero only if all $X_k$'s are zero, the next value higher than zero is $Z=1$, there is nothing in-between). – Vincent Granville Oct 19 '19 at 19:11
  • Could you explain how you accomplished such a feat when it's necessary to sample a countable infinity of $X_i$ to obtain one realization of $Z$? One would suppose you used some kind of shortcut based on an analysis of this process. My main concern is that this process does not converge almost surely, whence $Z$ is undefined almost surely. – whuber Oct 19 '19 at 19:16
  • 1
    @Whuber: Using 2,000 $X_k$ deviates for each $Z$ is more than enough to produce great accuracy. One way to check this is keeping the same first 2,000 $X_k$'s and adding another 10,000 (or one million) to create a deviate for $Z$: it does not change the first 10 digits or so of the sampled $Z$ value. Also, my interest is convergence in distribution. I plotted the empirical percentile distribution for $Z$ and it is converging (no formal proof, just empirical evidence.) – Vincent Granville Oct 19 '19 at 19:41
  • @Whuber: I added a second update; it does not directly address your question, but I believe it will make my stuff easier to quickly understand, by relating to material that people are familiar with. – Vincent Granville Oct 19 '19 at 20:03

1 Answers1

1

My answer has three parts. Part 1 is related to using the discrete distribution investigated earlier for $X_k$. Part 2 is related to finding a family of distributions to meet the requirements of the original question. Part 3 is a generalization to nested cubic roots and continued fractions.

Part 1: using the discrete distribution for $X_k$

Using the discrete distribution discussed earlier for $X_k$ (that is, with $p_1=1/2$, $p_2 = 1/(1+\sqrt{5})$ and $ p_3 = (3-\sqrt{5})/4$) then $Z$'s distribution is much smoother than with various other combinations of $p_1, p_2, p_3$. Yet it is deeply chaotic in the sense that it might be differentiable nowhere. In short, $f(z)$ seems to be defined nowhere, and formulas based on limits as in case 1 and case 2, make no real sense.

The density $f_Z$ may not exist, but the distribution $F_Z$ does. It clearly has three legs: $z \in [1, \sqrt{2}]$, $z \in [\sqrt{2}, \sqrt{3}]$ and $z \in [\sqrt{3}, 2]$. Based on case 1 that suggests $f_Z(z) \propto 1/z$ if $z\leq \sqrt{2}$, I decided to compute the integral to get a "best bet" for $F_Z(z) = P(Z<z)$, resulting in $F_Z(z) \propto \log z$.

Even though that step does not really make sense (since $f_Z$ does not exist), it yields a very good approximation for $F_Z$. Indeed, $F_Z(z)$ is very well approximated by $\log_2 z$, especially if $z \in [1,\sqrt{2}]$. The picture below shows $F_Z(z)$ in blue, and its approximation by $\log_2 z$ in red. The X-axis represents $z$, the Y-axis $F_Z(z)$.

enter image description here

The chart below shows the approximation error $E(z) = F_z(z) - \log_2 z$. Note that the error is maximum at $z = (1+\sqrt{5})/2$. Notable local minima for $E(z)$ include (among infinitely many others) $z=1, 2^{1/4}, \sqrt{2}, \sqrt{3}$ and $z=2$. Also, the curve below seems to be differentiable nowhere, indeed it has some of the patterns of a Brownian motion. In particular, one can see a fractal behavior, with the successive double-bumps (followed and preceded by a big dip all the way down to $E(z)=0$) repeating themselves over time but being amplified as $z$ increases. The maximum attained at each double-bump seems to be exactly 2 times the maximum reached at the previous double-bump.

enter image description here

Furthermore, it seems that the median is $\sqrt{2}$, though I haven't checked. Now if you switch the values of $p_1$ and $p_3$, then it looks like the median becomes $\sqrt{3}$. And if $p_1=p_2=p_3 = 1/3$ (a very chaotic case), it looks like the median becomes $(1+\sqrt{5})/2$.

Part 2: finding $X$ and $Z$ using characteristic functions

This is still a work in progress, but the idea is as follows. If $\phi_2$ is the characteristic function (CF) of $Z^2$, $\phi_1$ is the CF of $Z$, and $\phi$ is the CF of $X_k$, and if $\phi = \frac{\phi_2}{\phi_1}$, then the distribution of the nested square root of the $X_k$'s is also the distribution of $Z$.

The idea is to first find some $Z$ (that is, $\phi_2$ and $\phi_1$), compute the ratio of the two CF's. If this ratio is the CF of some distribution $X$, then we solved the problem (in a backward way, by specifying the limit $Z$ first, and then finding $X_k$.)

Note that $Z$ can not have a log-normal distribution unfortunately, because $Z$ can not be lower than 1 (prove it, this is an easy exercise.) A potential candidate for $Z$'s distribution is uniform on $[1, 2]$, or log-log-normal, that is $\log\log Z$ is normal.

Below is a chart based on $X$ being log-normal (see here for more.) It looks like $\log \log Z$ is almost normal, but it is not exactly normal.

enter image description here

Perhaps the easiest solution is considering $f_z(z) = \frac{2}{3} z$ with $z \in [1,2]$. Then $\mbox{CF}(Z) =\frac{2}{3}\int_1^2 z \exp(i t z)dz$ and $\mbox{CF}(Z^2) =\frac{2}{3}\int_1^2 z \exp(i t z^2)dz$. These two CF's are easy to compute and result in $$\mbox{CF}(X) = \frac{it}{2}\cdot\frac{e^{3it}-1}{e^{it}(1-2it)+it -1}.$$ But is the latter really a CF? It does not appear to be bounded. And is the support domain for $X$ equal to $[0, 2]$ as expected?

Part 3: Generalization to nested cubic roots and continued fractions

This can be generalized to nested cubic roots or continued fractions as follows. Consider $Z_{k+1}=(X_k + Z_k)^{\alpha}$ with $Z=\lim_{k\rightarrow\infty} Z_k, Z_0=0$ and the $X_k$'s are i.i.d.Then we have $\phi = \frac{\phi_\alpha}{\phi_1}$ where $\phi$ is the CF of $X_k$, $\phi_1$ is the CF of $Z$, and $\phi_\alpha$ is the CF of $Z^{1/\alpha}$. The most popular cases are:

  • $\alpha = 1/2$: Nested square roots,
  • $\alpha = 1/3$: Nested cubic roots,
  • $\alpha = -1$: continued fractions.