Transformation of probability distribution

Question

I have a question about a snippet on page 526 in the PRML book of Bishop.

Can someone explain to me why the right-hand side of equation (11.6) equals $z$?

It's unclear to me where this derivation comes from. Thanks for your help!

My first thoughts about this:

Plugging $p(z)=1$ into equation (11.5), we get $p(y)=|dz/dy|$.

Integrating the derivative of $z$ (the right-hand side) should give back the original $z$, leading to equation (11.6). However, where do the bounds $-\infty$ and $y$ of the integral come from?

It is described as an indefinite integral, so why does it have these bounds?

I also didn't take into account that there is an absolute value around $dz/dy$.

Can we just discard this as if it isn't there?

This looks wrong to me. Consider the transformation $z=1-y$ for which $|dz/dy|=1,$ whence (by $(11.5)$) $p(y)=p(z)(1)=1$ when $0\le y\le 1$ (and is $0$ otherwise). For $0\le y\le 1$ and $y\ne 1/2$ equation $(11.6)$ gives $$1-y=z=\int_{-\infty}^y I(0\le \hat y\le 1) p(\hat y) d\hat y = \int_0^y d\hat y = y,$$ a clear contradiction. — whuber, Oct 02 '18 at 14:15

Iron PS · Answer 1 · 2021-01-31T10:51:03.560

Answering your questions briefly:

The bounds come from the Cumulative Distribution Function (CDF) of $p(y)$. I.e.,

$P(Y\leq y)=\int\limits_{-\infty}^y p(\hat y)\,d\hat y$

The absolute value $\left | \frac{dz}{dy}\right |$ comes from the change of variable. See any reference on the change of variables for probability densities for more details on how to proceed.
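As a numerical sanity check of the CDF integral above, here is a minimal Python sketch. The exponential density with rate 2, the finite lower limit of $-10$ standing in for $-\infty$, and the function names are all illustrative assumptions, not something taken from the book.

```python
import math


def density(y: float, rate: float = 2.0) -> float:
    """Exponential density, zero for y < 0 (a hypothetical choice of p(y))."""
    return rate * math.exp(-rate * y) if y >= 0.0 else 0.0


def cdf_by_integration(y: float, lower: float = -10.0, steps: int = 200_000) -> float:
    """Midpoint-rule approximation of the integral of density from lower to y.

    The density vanishes below 0, so any lower limit <= 0 gives the same
    value as integrating from minus infinity.
    """
    width = (y - lower) / steps
    return sum(density(lower + (i + 0.5) * width) for i in range(steps)) * width


y0 = 0.5
# Compare the numerical integral with the closed-form CDF 1 - exp(-rate * y).
print(cdf_by_integration(y0), 1.0 - math.exp(-2.0 * y0))
```

The two printed numbers should agree closely, which is all the lower bound $-\infty$ expresses: accumulate all the probability mass up to $y$.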

For a more detailed answer, check below:

We have that $z$ is uniformly distributed over (0,1), thus

$p(z)=\frac{1}{1-0}=1$

Then, by defining $z=h(y)$ we have that

$p(y)=p(z)\left | \frac{d}{dy}z\right |=1\left | \frac{d}{dy}h(y) \right |$
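As an illustration, assume $h(y)=1-e^{-\lambda y}$ for $y\ge 0$, where the rate $\lambda>0$ is a hypothetical parameter chosen only for this example. Then

$$p(y)=\left|\frac{d}{dy}\left(1-e^{-\lambda y}\right)\right|=\lambda e^{-\lambda y},\qquad y\ge 0,$$

which is the exponential density. Choosing $h(y)=e^{-\lambda y}$ instead gives $\frac{d}{dy}h(y)=-\lambda e^{-\lambda y}<0$, and only the absolute value turns this into the same valid, non-negative density.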

We want to determine what the change-of-variable function should look like, so we isolate it by integrating $p(y)$. Note that $h(y)$ must be monotonic, either increasing or decreasing. If it is monotonically increasing, the absolute value can be dropped, and integrating the right-hand side of the expression for $p(y)$ gives

$\int\limits_{-\infty}^y\frac{d}{d\hat y}h(\hat y)d\hat y$

$=h(y)-\underset{n\to -\infty}{\lim}h(n)+C$

$=h(y)-0+C$

$=h(y)+C$

where the bounds make the left-hand side a CDF of $y$, and $\underset{n\to -\infty}{\lim}h(n)=0$ because $h$ is monotonically increasing, $h(y)=z$, and $z\in (0,1)$.

In the case of a monotonically decreasing function, $\left | \frac{d}{dy}h(y) \right |=-\frac{d}{dy}h(y)$ and $\underset{n\to -\infty}{\lim}h(n)=1$, so we have

$-\int\limits_{-\infty}^y\frac{d}{d\hat y}h(\hat y)d\hat y$

$=-(h(y)-\underset{n\to -\infty}{\lim}h(n))+C$

$=\underset{n\to -\infty}{\lim}h(n)-h(y)+C$

$=1-h(y)+C$

We can set the constant $C:=0$ in both cases. Since the left-hand side is $\int\limits_{-\infty}^yp(\hat y)d\hat y$, solving for $h(y)$ gives \begin{equation} h(y)=\begin{cases} \int\limits_{-\infty}^yp(\hat y)d\hat y, & h(y) \text{ is monotonically increasing} \\ 1-\int\limits_{-\infty}^yp(\hat y)d\hat y, & h(y) \text{ is monotonically decreasing} \end{cases} \end{equation}

Thus, our choice of $h(y)$ can be either of these options. Equation (11.6) in the book corresponds to the monotonically increasing case.
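To see that both branches produce samples with the same distribution, here is a minimal Python sketch. It assumes an exponential target density with a hypothetical rate `RATE`; the function names are mine and purely illustrative.

```python
import math
import random

RATE = 2.0  # hypothetical rate parameter of an exponential target density


def sample_increasing(z: float) -> float:
    """Invert z = h(y) = 1 - exp(-RATE * y), i.e. h is the CDF (increasing)."""
    return -math.log(1.0 - z) / RATE


def sample_decreasing(z: float) -> float:
    """Invert z = h(y) = exp(-RATE * y), i.e. h is 1 - CDF (decreasing)."""
    return -math.log(z) / RATE


def empirical_cdf(samples, y):
    """Fraction of samples that are <= y."""
    return sum(s <= y for s in samples) / len(samples)


rng = random.Random(0)
# Keep z strictly inside (0, 1) so both logarithms are defined.
zs = [z for z in (rng.random() for _ in range(100_000)) if 0.0 < z < 1.0]
ys_inc = [sample_increasing(z) for z in zs]
ys_dec = [sample_decreasing(z) for z in zs]

y0 = 0.5
true_cdf = 1.0 - math.exp(-RATE * y0)
# Both empirical CDFs should be close to the true CDF at y0.
print(true_cdf, empirical_cdf(ys_inc, y0), empirical_cdf(ys_dec, y0))
```

Both transformations map uniform $z$ to exponentially distributed $y$; they differ only in whether large $z$ corresponds to large or small $y$.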

Hope this helps.
