Solving for a pdf of a function of a continuous random variable. Justification and reason for the procedure

Question

Short version: When solving for the pdf of a function of a continuous random variable (say, $Y=X^2$), why can't you just plug the inverse of that function ($\pm\sqrt{y}$) into the pdf of the random variable? Why do you have to start from the cdf of $Y$, substitute $X^2$ for $Y$, and differentiate that to get the pdf?

The point of this question is the intermediate step involved.

Why is it not allowed to simply plug into the pdf of the RV?
Why is it necessary, and also justified, to go through the cdf instead?

Long version: Let's say you have a random variable $X$ and a function of that random variable, $Y=g(X)$ (such as $Y=X^2$). You are interested in finding the pdf of $Y$, i.e. $f_Y(y)$.

Usually, to do this, you have to start with the cdf of $Y$ first, and differentiate that to get a pdf of it. For example,

$$F_Y(y)=P(Y\le y)$$ and here you substitute $X^2$ for $Y$:

$$P(X^2 \le y)=F_X(\sqrt{y})-F_X(-\sqrt{y}),$$

which equals $2F_X(\sqrt{y})-1$ when $X$ is symmetric about zero. Then you differentiate this to get the pdf of $Y$: $$f_Y(y)=\frac{d}{dy}\left(2F_X(\sqrt{y})-1\right)$$

My question is: why? Why not just evaluate $$f_X(\pm\sqrt{y})$$ What is wrong with this approach, and why is it justified to use the cdf instead?

Please explain this in layman's terms, since I am clearly a novice in statistics.

score 4 · Accepted Answer · answered Sep 28 '13 at 16:25

It isn't really a statistics issue; it's a calculus issue. Remember the chain rule for taking derivatives:

$\frac{\text{d}F(h(x))}{\text{d}x} = \frac{\text{d}F(h(x))}{\text{d}h(x)} \frac{\text{d}h(x)}{\text{d}x}$

If you just substitute $h(x)$ in for $y$ where $y=h(x)$ in the expression for the pdf $f(y) = \frac{\text{d}F(y)}{\text{d}y}$, you can see that you are coming up with the first term on the r.h.s. of the above equality, but not the second. The second is called the "Jacobian of the transform".

If you don't include the Jacobian, then your cdf and pdf won't match up; the integral of your pdf over a given range won't equal the calculated values based on the cdf. On an intuitive (and therefore imprecise) level, you can think of $y$ and $x$ having different "scales", and the Jacobian converts from one to the other.
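To make the mismatch concrete, here is a small numerical sketch. It assumes, purely for illustration, that $X$ is standard normal and $Y=X^2$; the function names (`naive_pdf`, `jacobian_pdf`, `integrate`) are made up for this example. It integrates both candidate densities over an interval and compares each result with the probability given by the cdf of $Y$.

```python
import math

def phi(x):
    """Standard normal pdf (assumed distribution of X for this illustration)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def Phi(x):
    """Standard normal cdf, written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def cdf_y(y):
    """cdf of Y = X^2 for y > 0: P(-sqrt(y) <= X <= sqrt(y))."""
    r = math.sqrt(y)
    return Phi(r) - Phi(-r)

def naive_pdf(y):
    """Plug the inverse +/- sqrt(y) straight into f_X, with no Jacobian."""
    r = math.sqrt(y)
    return phi(r) + phi(-r)

def jacobian_pdf(y):
    """Same as naive_pdf, times the derivative of sqrt(y), i.e. 1 / (2 sqrt(y))."""
    r = math.sqrt(y)
    return (phi(r) + phi(-r)) / (2.0 * r)

def integrate(f, a, b, n=20000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

a, b = 0.5, 2.0
exact = cdf_y(b) - cdf_y(a)                 # P(a < Y <= b) from the cdf
naive = integrate(naive_pdf, a, b)          # integral of the plug-in "pdf"
corrected = integrate(jacobian_pdf, a, b)   # integral of the Jacobian-corrected pdf

print(f"from cdf:      {exact:.6f}")
print(f"naive plug-in: {naive:.6f}")
print(f"with Jacobian: {corrected:.6f}")
```

Only the version that includes the $1/(2\sqrt{y})$ factor should agree with the cdf; the plug-in version gives a noticeably different number over the same interval.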

The cdf is justified as the basis for this essentially as follows. I'll do a univariate case, but it extends directly to the multivariate case with more typing. Consider $F_Y(y) = P(Y \leq y)$ and the transform $y = g(x)$. Clearly $P(g(X) \leq y) = P(Y \leq y)$ where $Y = g(X)$. So, substituting $g(X)$ in for $Y$ in $F$ works, with no additional terms needed.
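As a worked sketch for the $Y=X^2$ example from the question (assuming $X$ has a continuous density $f_X$, and taking $y>0$), the cdf step needs no extra terms:

$$F_Y(y) = P(X^2 \le y) = P(-\sqrt{y} \le X \le \sqrt{y}) = F_X(\sqrt{y}) - F_X(-\sqrt{y})$$

Differentiating with the chain rule then produces the Jacobian factor automatically, since $\frac{\text{d}\sqrt{y}}{\text{d}y} = \frac{1}{2\sqrt{y}}$:

$$f_Y(y) = \frac{\text{d}F_Y(y)}{\text{d}y} = \frac{f_X(\sqrt{y}) + f_X(-\sqrt{y})}{2\sqrt{y}}$$

So the plug-in expression $f_X(\pm\sqrt{y})$ does appear in the answer, but it is scaled by $\frac{1}{2\sqrt{y}}$, which is exactly the term that direct substitution leaves out.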
