Let's say we have a random vector $\vec{X} \in \mathbb{R}^n$, drawn from a distribution with probability density function $f_{\vec{X}}(\vec{x})$. If we linearly transform it by a full-rank $n \times n$ matrix $A$ to get $\vec{Y} = A\vec{X}$, then the density of $\vec{Y}$ is given by $$ f_{\vec{Y}}(\vec{y}) = \frac{1}{\left|\det A\right|}f_{\vec{X}}(A^{-1}\vec{y}). $$
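For concreteness, here is a quick numerical sanity check of this formula (just a sketch: I take $\vec{X}$ standard normal, so $\vec{Y} \sim \mathcal{N}(0, AA^T)$, and the particular full-rank matrix $A$ is an arbitrary illustrative choice):

```python
import numpy as np
from scipy.stats import multivariate_normal

# With X ~ N(0, I_2), Y = A X ~ N(0, A A^T). The change-of-variables
# formula f_Y(y) = f_X(A^{-1} y) / |det A| should match that density exactly.
rng = np.random.default_rng(0)
A = np.array([[2.0, 1.0], [0.5, 3.0]])  # arbitrary full-rank 2x2 matrix

f_X = multivariate_normal(mean=np.zeros(2), cov=np.eye(2)).pdf
f_Y_true = multivariate_normal(mean=np.zeros(2), cov=A @ A.T).pdf

y = rng.standard_normal(2)  # an arbitrary test point
f_Y_formula = f_X(np.linalg.solve(A, y)) / abs(np.linalg.det(A))
print(np.isclose(f_Y_formula, f_Y_true(y)))  # True
```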
Now say we transform $\vec{X}$ instead by an $m \times n$ matrix $B$, with $m > n$, giving $\vec{Z} = B\vec{X}$. Clearly $\vec{Z} \in \mathbb{R}^m$, but it "lives on" an $n$-dimensional subspace $G \subset \mathbb{R}^m$, namely the column space of $B$ (assuming $B$ has full column rank). What is the conditional density of $\vec{Z}$, given that we know it lies in $G$?
My first instinct was to use the pseudo-inverse of $B$. If $B = U S V^T$ is the singular value decomposition of $B$, then $B^+ = V S^+ U^T$ is the pseudo-inverse, where $S^+$ is formed by transposing the diagonal matrix $S$ and inverting its non-zero entries. I guessed that this would give $$ f_{\vec{Z}}(\vec{z}) = \frac{1}{\left|\det^+ S\right|} f_{\vec{X}}(B^+ \vec{z}), $$ where by $\det^+ S$ I mean the product of the non-zero singular values.
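In code, the guess looks like this (a sketch with numpy/scipy; the tall matrix $B$ and the standard normal choice for $f_{\vec{X}}$ are just for illustration):

```python
import numpy as np
from scipy.stats import multivariate_normal

# Proposed formula: f_Z(z) = f_X(B^+ z) / det^+ S, where det^+ S is the
# product of the non-zero singular values of B.
B = np.array([[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]])  # m=3, n=2, full column rank

U, s, Vt = np.linalg.svd(B, full_matrices=False)  # thin SVD
B_pinv = Vt.T @ np.diag(1.0 / s) @ U.T            # same as np.linalg.pinv(B)
det_plus = np.prod(s)                             # product of non-zero singular values

f_X = multivariate_normal(mean=np.zeros(2), cov=np.eye(2)).pdf

def f_Z_proposed(z):
    """Candidate density, meaningful only for z in the column space of B."""
    return f_X(B_pinv @ z) / det_plus

x0 = np.array([0.3, -1.2])
print(f_Z_proposed(B @ x0))  # evaluate at a point that lies in col(B)
```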
This reasoning agrees with the density for a singular normal (conditioned on knowledge that the variable lives on the appropriate subspace) given here and mentioned also here and in this CrossValidated post.
But it isn't right! The normalization constant is off. A (trivial) counterexample: with $X \sim \mathcal{N}(0, 1)$, let $$ \vec{Y} = \begin{pmatrix}1 \\ 1\end{pmatrix} X = \begin{pmatrix}X \\ X\end{pmatrix}. $$ Here the matrix $B$ from above is just the ones vector. Its pseudo-inverse is $$ B^+ = \begin{pmatrix}1/2 & 1/2\end{pmatrix}, $$ and the only non-zero singular value is $\sqrt{2}$, so $\det^+ S = \sqrt{2}$. The reasoning from above would suggest $$ f_{\vec{Y}}(\vec{y}) = \frac{1}{\sqrt{2\pi}\sqrt{2}}\exp\left(-\frac{1}{2}\vec{y}^T (B^+)^T B^+ \vec{y}\right), $$ but this in fact integrates to $\frac{1}{\sqrt{2}}$, not $1$, when the line $y_1 = y_2$ is parametrized as $(t, t)$ and the integral is taken over $t$ (a numerical check is sketched below).

I realize that in this case you could just drop one of the entries of $\vec{Y}$ and be done, but when $B$ is much larger, identifying the set of entries to drop is annoying. Why doesn't the pseudo-inverse reasoning work? Is there a general formula for the density function of a linear transformation of a set of random variables by a "tall" matrix? Any references would be greatly appreciated as well.
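For reference, here is the numerical check behind the $\frac{1}{\sqrt{2}}$ figure (parametrizing the line as $(t, t)$ and integrating over $t$):

```python
import numpy as np
from scipy.integrate import quad

# Parametrize the line y1 = y2 as (t, t) and integrate the proposed density
# over t; the result is 1/sqrt(2), not 1, so the normalization is off.
B_pinv = np.array([[0.5, 0.5]])

def f_Y_proposed(t):
    x = B_pinv @ np.array([t, t])  # B^+ y = t
    return np.exp(-0.5 * x @ x) / (np.sqrt(2 * np.pi) * np.sqrt(2))

val, _ = quad(f_Y_proposed, -np.inf, np.inf)
print(val, 1 / np.sqrt(2))  # both approximately 0.7071
```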