If $$ \mathbf{X} = (X_1,\dots,X_n)^t$$ is a random vector with probability density function (pdf) $$ f_{X_1,\dots,X_n}(x_1,\dots,x_n), $$ and $A$ is a non-singular square matrix, then $$ \mathbf{Y} = A\mathbf{X} = (Y_1,\dots,Y_n)^t$$ has pdf $$ f_{Y_1,\dots,Y_n}(y_1,\dots,y_n)=\frac{f_\mathbf{X}(A^{-1} \mathbf{y})}{|\det A|}. $$ I have heard that this can be generalised to the case of a non-square matrix using the Moore–Penrose pseudo-inverse: with $$ \mathbf{Y} = A\mathbf{X} = (Y_1,\dots,Y_m)^t, \qquad m<n, $$ the pdf becomes $$ f_{Y_1,\dots,Y_m}(y_1,\dots,y_m)=\frac{f_\mathbf{X}(A^{+} \mathbf{y})}{|A|_{+}}, $$ where $A^+$ is the pseudo-inverse of $A$ and $|A|_{+}$ is the pseudo-determinant (the product of the non-zero singular values of $A$).
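For the square non-singular case, the formula is easy to check numerically. Here is a minimal sketch (my own illustration, not part of the claim above) that takes $\mathbf{X} \sim N(0, I_n)$, so that $\mathbf{Y} = A\mathbf{X} \sim N(0, AA^t)$ has a known closed-form pdf, and compares that pdf with $f_\mathbf{X}(A^{-1}\mathbf{y})/|\det A|$ at a random point:

```python
import numpy as np

def gaussian_pdf(x, cov):
    """Pdf of a zero-mean multivariate normal N(0, cov) evaluated at x."""
    n = len(x)
    norm = (2 * np.pi) ** (n / 2) * np.sqrt(np.linalg.det(cov))
    quad = x @ np.linalg.solve(cov, x)   # x^t cov^{-1} x
    return np.exp(-0.5 * quad) / norm

rng = np.random.default_rng(0)
n = 3
A = rng.normal(size=(n, n))   # a random matrix is non-singular almost surely
y = rng.normal(size=n)        # an arbitrary evaluation point

# Direct pdf of Y = A X: since X ~ N(0, I_n), we have Y ~ N(0, A A^t).
lhs = gaussian_pdf(y, A @ A.T)

# Change-of-variables formula: f_X(A^{-1} y) / |det A|.
rhs = gaussian_pdf(np.linalg.solve(A, y), np.eye(n)) / abs(np.linalg.det(A))

print(np.isclose(lhs, rhs))  # the two expressions agree to floating-point error
```

The agreement here relies on the identities $y^t (AA^t)^{-1} y = \|A^{-1}y\|^2$ and $\det(AA^t) = (\det A)^2$; the non-square case is exactly where these identities stop being automatic, which is what the question below is about.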
If this is right, how can it be proved? And, more importantly, what is the intuition behind this generalisation? I have only found this related question, but I cannot understand how the OP arrives at the general expression for $f_{Y_1,\dots,Y_m}(y_1,\dots,y_m)$.