The operation $C^{-\frac{1}{2}}$ refers to the decorrelation of the underlying sample into uncorrelated components; $C^{-\frac{1}{2}}$ is used as a whitening matrix. This is a natural operation when we want to analyse each column/source of the original data matrix $A$ (having a covariance matrix $C$) through an uncorrelated matrix $Z$. The most common way of implementing such whitening is through the Cholesky decomposition (where we use $C = LL^T$; see this thread for an example with "colouring" a sample), but here we use the slightly less common Mahalanobis whitening (where we use $C = C^{0.5} C^{0.5}$).
The whole operation in R would go a bit like this:
set.seed(323)
N <- 10000
p <- 3
# Define the true covariance matrix C
( C <- base::matrix( data = c(4, 2, 1,
                              2, 3, 2,
                              1, 2, 3), ncol = p, byrow = TRUE) )
# Generate the uncorrelated data (ground truth)
Z <- base::matrix( rnorm(N * p), ncol = p )
# Compute the colouring matrix C^0.5 (requires the expm package)
CSqrt <- expm::sqrtm(C)
# "Colour" the data; usually we use Cholesky (LL^T) but using C^0.5 is valid too
A <- t( CSqrt %*% t(Z) )
# Get the sample estimate of C
( CEst <- round( digits = 2, cov( A )) )
# Estimate the whitening matrix C^-0.5
CEstInv <- expm::sqrtm(solve(CEst))
# Whiten the data
ZEst <- t( CEstInv %*% t(A) )
# Check that we have indeed whitened the data
( round( digits = 1, cov( cbind(ZEst, Z) ) ) )
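For comparison, here is a minimal sketch of the Cholesky-based colouring mentioned above; note that R's chol() returns an upper-triangular factor $R$ with $C = R^T R$, so the lower-triangular colouring matrix $L$ is its transpose:

# Cholesky-based colouring for comparison (base R, no extra packages)
# chol() returns upper-triangular R with C = R^T R, so L = t(chol(C))
L <- t( chol(C) )
ACh <- t( L %*% t(Z) )
# The sample covariance again approximates C
round( digits = 2, cov( ACh ) )

Either factor works because any matrix $M$ with $M M^T = C$ turns an uncorrelated sample into one with covariance $C$.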
So to succinctly answer the question raised:
- It means that we can decorrelate the sample $A$ that is associated with the covariance matrix $C$ in such a way that we get uncorrelated components. This is commonly referred to as whitening.
- The general linear algebra idea it assumes is that the square root of a covariance matrix can be used as a linear operator to generate a correlated sample (by "colouring"), and that its inverse can likewise be used to decorrelate/"whiten" a sample.
- Yes, the easiest way to raise a valid covariance matrix to any power (the negative square root is just a special case) is by using its eigendecomposition $C = V \Lambda V^T$, $V$ being an orthonormal matrix holding the eigenvectors of $C$ and $\Lambda$ being a diagonal matrix holding the eigenvalues. Then we can readily raise the diagonal matrix $\Lambda$ to the power we wish, since $C^p = V \Lambda^p V^T$, and get the relevant result.
A small code snippet showcasing the third point:
# Get the eigendecomposition of the covariance matrix
myEigDec <- eigen(cov(A))
# Use the eigendecomposition to get the inverse square root
myEigDec$vectors %*% diag( 1/ sqrt( myEigDec$values) ) %*% t(myEigDec$vectors)
# Use the eigendecomposition to get the "negative half power" (same as above)
myEigDec$vectors %*% diag( ( myEigDec$values)^(-0.5) ) %*% t(myEigDec$vectors)
# And confirm against the R package expm
solve(expm::sqrtm(cov(A)))
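As a quick sanity check of the general claim that $C^p = V \Lambda^p V^T$, here is a short snippet (the all.equal comparisons are additions for verification only, not part of the answer above) confirming that the routes agree numerically and that, e.g., the power $p = 2$ recovers the plain matrix product:

# Verify that the eigendecomposition result matches expm's (up to numerical error)
invSqrt <- myEigDec$vectors %*% diag( myEigDec$values^(-0.5) ) %*% t(myEigDec$vectors)
all.equal( invSqrt, solve(expm::sqrtm(cov(A))) )
# The same recipe works for any power, e.g. p = 2 recovers C %*% C
CSq <- myEigDec$vectors %*% diag( myEigDec$values^2 ) %*% t(myEigDec$vectors)
all.equal( CSq, cov(A) %*% cov(A) )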