Given an $n$ by $p$ matrix $\pmb X$, the SVD of the centered and scaled $\pmb X$ is:
$$\text{svd}((\pmb X-\bar{x})/\sqrt{n-1})=\pmb{UDV}'$$
(Here $\bar{x}$ is the vector of column means of $\pmb X$. I will denote by $\pmb V_k$ the matrix formed of the first $k$ columns of $\pmb V$ and by $\pmb D_k$ the diagonal matrix formed of the first $k$ rows and columns of $\pmb D$.)
The SVD divides the total variance of $\pmb X$ into two
mutually orthogonal components:
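- The variance of the projection of $(\pmb X-\bar{x})$ on the space spanned by the first $k$ right singular vectors of $(\pmb X-\bar{x})$. For the $i$-th observation $\pmb x_i$ (the $i$-th row of $\pmb X$, written as a column vector), the corresponding distance is:
$$\sqrt{(\pmb x_i-\bar{x})'\pmb{V_kD_k^{-2}V_k'}(\pmb x_i-\bar{x})}$$
(the diagonal of $\pmb D_k$ holds singular values, so $\pmb D_k^{2}$ holds the first $k$ eigenvalues of the covariance matrix).
- The variance of the projection of $(\pmb X-\bar{x})$ on the space orthogonal to the first $k$ right singular vectors of $(\pmb X-\bar{x})$. For the $i$-th observation, the corresponding orthogonal distance is:
$$\sqrt{(\pmb x_i-\bar{x})'(\pmb{I_p}-\pmb{V_kV_k'})(\pmb x_i-\bar{x})}$$
(where $\pmb I_p$ is the $p$ by $p$ identity matrix), which is also equivalent to the Euclidean norm of the $i$-th row of:
$$\pmb X-\bar{x}-(\pmb X-\bar{x})\pmb{V_kV_k'}$$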
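The split into two orthogonal parts can be checked directly. A short derivation, writing $\pmb x_i$ for the $i$-th row of $\pmb X$ as a column vector, $\pmb I$ for the $p$ by $p$ identity and $\pmb P=\pmb{V_kV_k'}$ (these symbols are my own shorthand), and using $\pmb{V_k'V_k}=\pmb I_k$ so that $\pmb P$ and $\pmb I-\pmb P$ are both symmetric and idempotent:

$$\begin{aligned}
||\pmb x_i-\bar{x}||^2 &= (\pmb x_i-\bar{x})'\big(\pmb P+(\pmb I-\pmb P)\big)(\pmb x_i-\bar{x})\\
&= ||\pmb V_k'(\pmb x_i-\bar{x})||^2+||(\pmb I-\pmb P)(\pmb x_i-\bar{x})||^2
\end{aligned}$$

The second term is the squared orthogonal distance of observation $i$; the first is the squared length of its projection on the first $k$ singular vectors.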
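In R:

    n <- 100
    p <- 20
    k <- 5
    x <- matrix(rnorm(n * p), ncol = p)  # your data matrix
    # center each column of the data
    data_centered <- sweep(x, 2, colMeans(x), FUN = "-")
    # loadings: first k right singular vectors of the centered, scaled data
    loadings <- svd(data_centered / sqrt(nrow(data_centered) - 1), nu = 0)$v[, 1:k]
    # the orthogonal distances: row norms of the residuals after projection
    orthDist <- data_centered - data_centered %*% loadings %*% t(loadings)
    orthDist <- sqrt(rowSums(orthDist * orthDist))
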
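As a sanity check, here is a minimal sketch that computes the orthogonal distances two ways (row norms of the residuals, and the quadratic form with $\pmb I-\pmb{V_kV_k'}$) and also the distances inside the subspace. The seed and the names `xc`, `Vk`, `dk`, `scoreDist` and `orthDistQF` are my own choices for illustration, and I assume the in-subspace distance scales each score by the corresponding singular value (the square root of the eigenvalue of the covariance matrix).

```r
set.seed(1)                                  # arbitrary seed, only for reproducibility
n <- 100; p <- 20; k <- 5
x <- matrix(rnorm(n * p), ncol = p)          # simulated data, stands in for your matrix

xc <- sweep(x, 2, colMeans(x), FUN = "-")    # centered data
s  <- svd(xc / sqrt(n - 1), nu = 0)
Vk <- s$v[, 1:k, drop = FALSE]               # p x k matrix of loadings
dk <- s$d[1:k]                               # first k singular values

scores <- xc %*% Vk                          # n x k coordinates inside the subspace
resid  <- xc - scores %*% t(Vk)              # n x p part orthogonal to the subspace

# orthogonal distances: Euclidean norm of each residual row
orthDist <- sqrt(rowSums(resid^2))

# same quantity through the quadratic form x_i' (I - Vk Vk') x_i
P <- diag(p) - Vk %*% t(Vk)
orthDistQF <- sqrt(rowSums((xc %*% P) * xc))

# distances inside the subspace, each score divided by its singular value
scoreDist <- sqrt(rowSums(sweep(scores, 2, dk, FUN = "/")^2))

# both checks should print TRUE up to numerical tolerance
print(isTRUE(all.equal(orthDist, orthDistQF)))
print(isTRUE(all.equal(rowSums(xc^2), rowSums(scores^2) + orthDist^2)))
```

The last check is the Pythagorean split: the squared norm of each centered row is the squared norm of its scores plus its squared orthogonal distance.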
You will find more complete code to compute the OD (and the SD, the statistical distances on the space spanned by the loading matrix) in the `rrcov` package:

    library(rrcov)
    rrcov:::.distances

The function is not documented, but the slot `res@od` therein holds the orthogonal distances.
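If you would rather not call the internal function, the following sketch reads the same slot from a fitted object instead. It rests on assumptions I have not checked against every version of the package: that `rrcov` is installed, that `PcaClassic()` accepts a `k` argument and centers without scaling by default, and that the fitted object exposes an `od` slot. The code is skipped when the package is missing.

```r
# Assumption: rrcov is installed; otherwise nothing is run.
if (requireNamespace("rrcov", quietly = TRUE)) {
  set.seed(1)                                # arbitrary seed, only for reproducibility
  n <- 100; p <- 20; k <- 5
  x <- matrix(rnorm(n * p), ncol = p)        # simulated data, stands in for your matrix

  # classical PCA keeping k components; orthogonal distances are read from the slot
  pc <- rrcov::PcaClassic(x, k = k)
  od_rrcov <- unname(pc@od)

  # manual computation, as in the snippet above
  xc <- sweep(x, 2, colMeans(x), FUN = "-")
  Vk <- svd(xc, nu = 0)$v[, 1:k, drop = FALSE]
  od_manual <- sqrt(rowSums((xc - xc %*% Vk %*% t(Vk))^2))

  # prints TRUE if the two agree, otherwise a description of the difference
  print(all.equal(od_rrcov, od_manual))
}
```

Dividing by $\sqrt{n-1}$ before the SVD is dropped in the manual part because it rescales the singular values but leaves the singular vectors, and hence the projection, unchanged.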