Does classic MultiDimensional Scaling reconstructs data upto a rotation when there's no noise and embedding dimensions equals original data dimension?

Question

In the setup of classical MultiDimensional Scaling (MDS), assume that $D:=[d_{ij}]$ be an $n \times n$ distance matrix, i.e. $d(i,i)=0, d(i,j)=d(j,i) > 0 \forall i, j = 1 \dots n.$ Assume that: the original data $X:=[x_1 \dots x_n] \in \mathbb{R}^{d \times n}, x_i \in \mathbb{R}^{d \times 1}$ generating the distance matrix $D$ was Euclidean and of dimension $p$. So mathematically, assume: $||x_i - x_j||^2 = d_{ij}^2$.

Assume now that, we're performing classical MDS on the data to embed it into $p$ dimensions again. I'm aware that we won't know $p$, but the point here is not the knowledge of $p$: the point is that (somehow!) the embedding dimension is exactly equal to the original dimension of the data. That is, we're performing the following algorithm:

Define the centering matrix $H_n:= I_n - \frac{1}{n}1_n1_n^{T}.$
Perform the SVD of $-\frac{1}{2}H_nD^2H_n = (XH_n)^{T}(XH_n)$, where $D^2$ is the entrywise square of $D$ (so, not the square of $D$ as in matrix multiplication/squaring). Assume $-\frac{1}{2}H_nD^2H_n= U\Lambda U^{T}= \sum_{i=1}^{n} \lambda_i u_i u_i^{T} $ after SVD.
From the diagonal matrix $\Lambda$ above that contains the eigenvalues $\{\lambda_1 \geq\dots \geq \lambda_n \geq 0\}$ of $-\frac{1}{2}H_nD^2H_n,$ extract the square roots of the $p$ largest eigenvalues, and the corresponding eigenvectors. Call the corresponding part of $\Lambda$ to be $\Lambda_p$, and that of $U^{T}$ to be $U_p^{T}$
Finally, the $p$ -dimensional embedding is given by the $n$ columns of the $p \times n$ matrix $Y_p:=\sqrt\Lambda_p U_p^{T}$.

Assume now that the original, unknown data was centered (we don't have to assume this, but doing this anyway to get rid of the pesky $H_n$ appearing on both sides due to double centering, this'll simplify notations): i.e. assume that $XH_n = X$. It's easy to check that: the embedding $Y_p$ generated by MDS is also centered, i.e. $Y_p H_n = Y_p$, and that $Y_p^{T}Y_p = X^{T}X= \sum_{i=1}^{p} \lambda_i u_i u_i^{T}$, since the rank of $X^{T}X \le d$, so $\lambda_{d+1}= \dots \lambda_n=0$.

My questions are: (Please supply proofs or references to look at)

(1) Is $Y_p = R.X$ for some $R \in O(p)$ in this case?

(2) If not, is there a useful estimate of the distance between the original and MDS constructed data: $||X - Y_p||$, perhaps an upper and a lower bound? Thank you in advance!

What you call "Classic" MDS here is more precisely to name Torgerson MDS or PCoA (Principal coordinate analysis). It sounds to me that you meant this analysis. If so, I could remark that - when the input to the MDS is the euclidean distance matrix between the points than the analysis is essentially _equivalent_ to PCA. https://stats.stackexchange.com/q/14002/3277. PCA with m extracted dimensions = p original dimensions of the data _is_ just a rotation. — ttnphns, Mar 09 '20 at 17:48
(cont.). It is simple to show that in this case the coordinates given by PCoA are identical (up to sign) to the componet scores that would be computed by PCA of those same data (with dimensions (columns) centered) which gave you the distance matrix. — ttnphns, Mar 09 '20 at 17:48
This simply follows from some equivalences between the centration matrix and the scatter matrix mentioned for example here https://stats.stackexchange.com/a/183930/3277. Check also two top paragraps here https://stats.stackexchange.com/q/357966/3277. [Please forgive if you wanted strict proofs or if I misinterpreted your Q altogether.] — ttnphns, Mar 09 '20 at 17:53
@ttnphns Thank you for the comments and the links! I'm sure going through the link in greater detail would be beneficial for me, but for now: I didn't want to go into PCA, which your comments seem to refer to. My question is solely about MDS: if we start with the dist matrix coming from a Euclidean $d$-dim data $X$ with no noise, do MDS an dembed the data into $d$-dim again, then will we get the same data upto a rotation, i.e. if the embedded data matrix is $Y_p$, then is there a $d\times d$ orthogonal matrix $R$ so that $R.Y_p= X$? Think your 2nd comment says yes, but not sure... — Mathmath, Mar 09 '20 at 21:01
(I don't know what mean by "noise" and why it would meddle here.) But the answer is Yes. Let have nxp data X (with p, full rank, dimensions and compute nxn (squared) euclidean distance matrix D. We perform PcoA mds on D to get nxp coordinates Y ("new data"). Now, we center columns (dimensions) of X (call it Xc; of course, such translating does not change D of the data). Then perform _Procrustes_ rotation of Y into Xc: Yr=Y*R, where R is the orthogonal rotation matrix maximally mapping Y into Xc (or back). We find, that Yr=Xc. Complete matching. — ttnphns, Mar 09 '20 at 21:22
The reason of this is that PCoA (X->D->double centrate->eigendecomposition->coordinates) is actually a PCA (only seen from another "side"): Xc->scatter matrix->eigendecomposition->component scores. Both eigendempositions (or svd's) yield the same nonzero (effective) eigenvalues. — ttnphns, Mar 09 '20 at 21:30

Does classic MultiDimensional Scaling reconstructs data upto a rotation when there's no noise and embedding dimensions equals original data dimension?

0 Answers0