
I have a pdf, say $p(x)$. I apply some transformation (linear or nonlinear) to the variable $x$, say $y = g(x)$, and call the new pdf $p(y)$. For a small change $dx$ in $x$, there will be a corresponding change $dy$ in $y = g(x)$. Since the area under the curve has to be the same, $p(x)\,dx = p(y)\,dy$.

I was studying Bishop's Pattern Recognition and Machine Learning, and on page 18 it says that under a nonlinear change of variable, a pdf transforms differently from an ordinary function. I think it will also transform differently under a linear transformation. Secondly, the book says $$p(y)=p(x)\left| \frac{dx}{dy} \right|$$ I also don't understand the modulus (absolute value): $p(x)$, $p(y)$, $dx$, $dy$ can't be negative.

Xi'an
KAY_YAK
  • Please consider the transformation $y: x\to -x.$ What is $dy/dx$? – whuber Dec 19 '18 at 20:58
  • Won't that make area negative? – KAY_YAK Dec 19 '18 at 21:00
  • It makes the *signed* area negative, but that's not relevant for probability calculations, which concern only the absolute area. This transformation clearly demonstrates that $dy = -dx$ can indeed be negative. – whuber Dec 19 '18 at 21:19

1 Answer


It may help to look at the square of $U \sim \mathsf{Unif}(0,1),$ which is $X = U^2 \sim \mathsf{Beta}(.5,1).$ [See Wikipedia.] Because the density of $U$ is $f_U(u) = 1,$ for $0 < u < 1,$ the entire form of the density of $X$ is due to the Jacobian factor $J = |du/dx|.$
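For reference, the Jacobian factor can be worked out explicitly. This is a short derivation using only the definitions above; the inverse map $u = \sqrt{x}$ is the one step not written out in the text:

$$u = \sqrt{x}, \qquad \frac{du}{dx} = \frac{1}{2\sqrt{x}}, \qquad f_X(x) = f_U(\sqrt{x})\left|\frac{du}{dx}\right| = \frac{1}{2\sqrt{x}}, \quad 0 < x < 1.$$

This agrees with the $\mathsf{Beta}(.5,1)$ density $x^{-1/2}/B(.5,1),$ since $B(.5,1) = 2.$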

In the simulation below, there are $n = 10^5$ values of $U$ and of $X,$ of which approximately $10{,}000$ fall into each bin of each histogram.

Rectangular bars for both distributions are all of (essentially) the same area. Specifically, the red bin for the $u_i$'s has width $0.1$ and height $1,$ giving area $0.1.$ Similarly, the corresponding bin for the $x_i$'s has base $(0,.01)$ of length $0.01$ and height determined by the value of $J$ in that interval, which averages about $10.$ [Perhaps more simply, you might investigate what happens to values $0.9 < u_i < 1$ (violet bar) under the transformation.]
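Both areas can be checked by hand, taking the density of $X$ to be $f_X(x) = 1/(2\sqrt{x})$ on $(0,1)$ (the $\mathsf{Beta}(.5,1)$ density). For the red bin and the violet bin respectively:

$$P(0 < X < 0.01) = \int_0^{0.01} \frac{dx}{2\sqrt{x}} = \sqrt{0.01} = 0.1, \qquad \text{average height} = \frac{0.1}{0.01} = 10,$$

$$P(0.81 < X < 1) = \sqrt{1} - \sqrt{0.81} = 0.1, \qquad \text{average height} = \frac{0.1}{0.19} \approx 0.53.$$

So values $0.9 < u_i < 1$ are spread over an interval of length $0.19,$ and the violet bar becomes wider and lower while keeping area $0.1.$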

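[Figure: side-by-side histograms of $u$ (left) and $x = u^2$ (right), with the $\mathsf{Unif}(0,1)$ and $\mathsf{Beta}(.5,1)$ density curves overlaid.]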

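set.seed(1219)
u <- runif(10^5);  x <- u^2            # x = u^2 ~ Beta(.5, 1)
par(mfrow=c(1,2))                      # two panels side by side
  bin.u <- seq(0, 1, by=.1)            # ten equal-width bins for u
  hist(u, prob=TRUE, ylim=c(0,10), breaks=bin.u,
       col=rainbow(12)[1:10])
    curve(dunif(x), add=TRUE, lwd=2)   # Unif(0,1) density
  bin.x <- bin.u^2                     # images of the u-bins under x = u^2
  hist(x, prob=TRUE, breaks=bin.x, col=rainbow(12)[1:10])
    curve(dbeta(x,.5,1), add=TRUE, lwd=2)   # Beta(.5,1) density
par(mfrow=c(1,1))                      # restore single-panel layout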

Note: It is generally a good idea to choose all bars in a histogram to be of the same width, unless there is a good reason to do otherwise (as here).
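On the absolute value asked about in the question: a decreasing transformation makes the derivative negative, as whuber's comment points out. Here is a minimal sketch of that case, assuming the transformation $Y = -\log U$ (my choice for illustration, not one used above), for which $u = e^{-y}$ and $du/dy = -e^{-y} < 0.$ The names `signed_deriv` and `f_y` are hypothetical helpers introduced for this sketch.

```r
# Sketch: decreasing transformation Y = -log(U), with U ~ Unif(0,1).
# Here u = exp(-y), so du/dy = -exp(-y) is negative for every y.
set.seed(2024)                         # arbitrary seed chosen for this sketch
u <- runif(10^5)
y <- -log(u)

signed_deriv <- function(y) -exp(-y)   # du/dy (negative)
# Change-of-variables density: f_U(u(y)) * |du/dy|
f_y <- function(y) dunif(exp(-y)) * abs(signed_deriv(y))

# With the absolute value, the formula reproduces the Exp(1) density ...
grid <- seq(0.1, 5, by = 0.1)
max_diff <- max(abs(f_y(grid) - dexp(grid)))

# ... and matches the simulated proportion in a bin, here (0.5, 1].
emp_prob <- mean(y > 0.5 & y <= 1)
thy_prob <- integrate(f_y, 0.5, 1)$value   # exp(-0.5) - exp(-1)

print(c(max_diff = max_diff, empirical = emp_prob, theoretical = thy_prob))
```

Without the absolute value, the "density" $f_U(e^{-y})\,du/dy = -e^{-y}$ would be negative everywhere. The sign of $du/dy$ only records that the bin endpoints swap order, and the probability in the bin is unchanged.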

BruceET
  • +1. A very closely related question (and, if I may say so, a very similar answer) appears at https://stats.stackexchange.com/questions/14483. I don't maintain they are duplicates, though, because they do not explicitly address the issue of a negative Jacobian. – whuber Dec 19 '18 at 22:42
  • Thanks. I flipped a coin to decide between $X^2$ and $\sqrt{X},$ which might have been less redundant. I hope @KAY_YAK will follow your link for its mathematical detail. – BruceET Dec 19 '18 at 22:47