In the book *Pattern Recognition and Machine Learning* (formula 1.27), Bishop gives
$$p_y(y)=p_x(x) \left | \frac{d x}{d y} \right |=p_x(g(y)) \, | g'(y) |$$ where $x=g(y)$, and $p_x(x)$ is the pdf corresponding to $p_y(y)$ under the change of variable.
The book says this is because observations falling in the range $(x, x + \delta x)$ will, for small $\delta x$, be transformed into the range $(y, y + \delta y)$.
How is this derived formally?
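For concreteness, here is a quick Monte Carlo sanity check of (1.27) with one specific choice of $g$ (my own example, not from the book): take $X \sim \text{Uniform}(0,1)$ and $x = g(y) = e^{-y}$, so $Y = -\ln X$ and the formula predicts $p_y(y) = 1 \cdot |{-e^{-y}}| = e^{-y}$, an Exponential(1) density.

```python
import random, math

# Sanity check of PRML formula (1.27): p_y(y) = p_x(g(y)) |g'(y)|.
# Example choice (not from the book): X ~ Uniform(0,1), x = g(y) = exp(-y),
# so Y = -ln(X) and the formula predicts p_y(y) = exp(-y) for y > 0.

random.seed(0)
n = 200_000
# 1 - random() lies in (0, 1], so the log is always defined.
ys = [-math.log(1.0 - random.random()) for _ in range(n)]

# Compare the empirical density of Y on a few bins with exp(-y).
width = 0.5
for lo in (0.0, 0.5, 1.0, 2.0):
    hi = lo + width
    empirical = sum(lo <= y < hi for y in ys) / (n * width)
    predicted = math.exp(-(lo + hi) / 2)  # exp(-y) at the bin midpoint
    print(f"[{lo:.1f}, {hi:.1f})  empirical={empirical:.3f}  predicted={predicted:.3f}")
```

The empirical bin heights track $e^{-y}$ up to Monte Carlo noise, which is the content of the $\delta x \mapsto \delta y$ argument: probability mass in $(x, x+\delta x)$ lands in $(y, y+\delta y)$.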
Update from Dilip Sarwate
The result holds only if $g$ is a strictly monotone increasing or decreasing function.
A minor edit to L.V. Rao's answer:
$$P(Y\le y) = P(g(X)\le y)= \begin{cases} P(X\le g^{-1}(y)), & \text{if } g \text{ is monotonically increasing} \\ P(X\ge g^{-1}(y)), & \text{if } g \text{ is monotonically decreasing} \end{cases}$$
Therefore, if $g$ is monotonically increasing,
$$F_{Y}(y)=F_{X}(g^{-1}(y)),$$
$$f_{Y}(y)= f_{X}(g^{-1}(y))\cdot \frac{d}{dy}g^{-1}(y),$$
and if $g$ is monotonically decreasing (taking $X$ continuous),
$$F_{Y}(y)=1-F_{X}(g^{-1}(y)),$$
$$f_{Y}(y)=- f_{X}(g^{-1}(y))\cdot \frac{d}{dy}g^{-1}(y).$$
In the decreasing case $g^{-1}$ is also decreasing, so $\frac{d}{dy}g^{-1}(y)<0$ and the minus sign makes the density positive. Both cases combine into
$$\therefore f_{Y}(y) = f_{X}(g^{-1}(y)) \cdot \left | \frac{d}{dy}g^{-1}(y) \right |$$
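The decreasing branch can be checked numerically with a hypothetical example (mine, not from the answer above): $X \sim \text{Exponential}(1)$ and $Y = g(X) = e^{-X}$, which is strictly decreasing. Then $g^{-1}(y) = -\ln y$, and the formula gives $f_Y(y) = f_X(-\ln y)\cdot|{-1/y}| = y \cdot (1/y) = 1$ on $(0,1)$, i.e. $Y \sim \text{Uniform}(0,1)$.

```python
import math

# Check the decreasing case: X ~ Exponential(1), Y = g(X) = exp(-X).
# g^{-1}(y) = -ln y, and the derived formula predicts f_Y(y) = 1 on (0, 1).

def F_X(x):
    # CDF of Exponential(1)
    return 1.0 - math.exp(-x) if x > 0 else 0.0

def F_Y(y):
    # Decreasing case from the derivation: F_Y(y) = 1 - F_X(g^{-1}(y))
    return 1.0 - F_X(-math.log(y))

def f_Y(y):
    # Density from the change-of-variables formula
    f_X_at_inv = math.exp(math.log(y))   # f_X(g^{-1}(y)) = exp(-(-ln y)) = y
    return f_X_at_inv * abs(-1.0 / y)    # |d/dy g^{-1}(y)| = 1/y

# Compare the formula against a finite-difference derivative of F_Y.
h = 1e-6
for y in (0.2, 0.5, 0.8):
    finite_diff = (F_Y(y + h) - F_Y(y - h)) / (2 * h)
    print(f"y={y}  formula={f_Y(y):.6f}  dF_Y/dy≈{finite_diff:.6f}")
```

Note that without the minus sign (or equivalently the absolute value), $f_X(g^{-1}(y))\cdot \frac{d}{dy}g^{-1}(y)$ would be $-1$ here, which is why the decreasing case needs it.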