I've found the following definition of PCA and I was trying to better understand it.
Given a dataset $X \in \mathbb{R}^{d \times m}$, we approximate it as $X \approx V^TA$ by minimizing the following loss subject to the constraints:
\begin{align} \ell_{PCA}(X; A, V) &:= \|X - V^TA\|^2_F\\ &\text{s.t.} \quad A \in \mathbb{R}^{\ell \times m}, \; V \in \mathbb{R}^{\ell \times d}, \; VV^T = I_{\ell} \end{align}
Now I'm trying to better understand what the matrices $A$ and $V$ actually represent. My guess is that the rows of $V$ form an orthonormal basis of eigenvectors (of $XX^T$), and that $A$ is the matrix of coefficients obtained by projecting the original data in $X$ onto the subspace spanned by the rows of $V$ (equivalently, the columns of $V^T$).
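To test this guess numerically, here is a minimal NumPy sketch (my own construction, not from the definition above): take the rows of $V$ to be the top-$\ell$ eigenvectors of $XX^T$ and set $A = VX$; the loss then equals the sum of the discarded eigenvalues.

```python
import numpy as np

rng = np.random.default_rng(0)
d, m, l = 5, 100, 2          # ambient dim, number of samples, subspace dim

X = rng.standard_normal((d, m))

# eigh returns eigenvalues in ascending order; take the l leading eigenvectors
# of X X^T as the ROWS of V, so that V V^T = I_l.
eigvals, eigvecs = np.linalg.eigh(X @ X.T)
V = eigvecs[:, -l:].T        # shape (l, d)

A = V @ X                    # projection coefficients, shape (l, m)

assert np.allclose(V @ V.T, np.eye(l))   # the orthogonality constraint holds

# Reconstruction loss: squared Frobenius norm of X minus its projection.
loss = np.linalg.norm(X - V.T @ A, "fro") ** 2

# It matches the sum of the d - l smallest eigenvalues of X X^T.
assert np.isclose(loss, eigvals[:-l].sum())
```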
I still have some trouble expanding the above formula in a fully explicit form:
\begin{align} \|X - V^TA\|^2_F &= \sum_{i=1}^d \sum_{j=1}^m |x_{ij} - (V^TA)_{ij}|^2 \\ &= \; ? \end{align}
Should I write the matrix $V^TA$ as a sum of outer products?
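For reference, here is a small numerical check of that outer-product idea (a sketch with random matrices; $V[k]$ and $A[k]$ denote the $k$-th rows): since $(V^TA)_{ij} = \sum_{k=1}^{\ell} v_{ki} a_{kj}$, the product $V^TA$ is a sum of $\ell$ rank-one outer products, and the squared Frobenius norm really is the entrywise sum.

```python
import numpy as np

rng = np.random.default_rng(1)
d, m, l = 4, 6, 2
V = rng.standard_normal((l, d))
A = rng.standard_normal((l, m))

# (V^T A)_{ij} = sum_k v_{ki} a_{kj}: a sum of l rank-one outer products.
outer_sum = sum(np.outer(V[k], A[k]) for k in range(l))
assert np.allclose(V.T @ A, outer_sum)

# Entrywise expansion of the squared Frobenius norm.
X = rng.standard_normal((d, m))
lhs = np.linalg.norm(X - V.T @ A, "fro") ** 2
rhs = sum(abs(X[i, j] - (V.T @ A)[i, j]) ** 2
          for i in range(d) for j in range(m))
assert np.isclose(lhs, rhs)
```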
I always get stuck when it comes to matrix algebra notation...