11

Say I have the following maximization.

$$ \max_{R: R^T R=I_n} \operatorname{Tr}(RZ),$$ where $R$ is an $n\times n$ orthogonal transformational, and the SVD of $Z$ is written as $Z = USV^T$.

I'm trying to find the optimal $R^*$ which intuitively I know is equal to $VU^T$ where $$\operatorname{Tr}(RZ)=\operatorname{Tr}(VU^T USV^T)=\operatorname{Tr}(S).$$ I know this is the max since it is the sum of all the singular values of $Z$. However, I'm having trouble coming up with a mathematical proof justifying my intuition.

Any thoughts?

glS
  • 6,303
  • 3
  • 28
  • 50
Eagle1992
  • 511
  • 2
  • 6
  • 17

2 Answers2

17

Let $A=\sqrt{S}$, and equip the space of $n\times n$ real matrices with the usual Euclidean scalar product. Then $$\hbox{Tr}(RZ)= \hbox{Tr}(RUA^2V^T)=\hbox{Tr}((RUA)(VA)^T)=\langle RUA,VA\rangle$$ By the Cauchy-Schwarz inequality, we get $$\hbox{Tr}(RZ)\leq \Vert RUA \Vert_2 \Vert VA \Vert_2= \Vert A \Vert_2 \Vert A \Vert_2 =\hbox{Tr}(AA^T)=\hbox{Tr}(S)$$ where we used the invariance of the $\Vert \cdot \Vert_2 $ under orthogonal transformations. the converse inequality, is proved by choosing $R=VU^T$, and we are done.

Martin Argerami
  • 195,360
  • 15
  • 132
  • 259
Omran Kouba
  • 28,320
  • 1
  • 51
  • 96
  • thank you! the proof is easy to understand. however, I would like to understand your thought process behind it, what led you to define $A= \sqrt{S}$ and continue? – Eagle1992 Apr 15 '14 at 19:48
  • I wanted to relate your problem to a quadratic optimization problem, and the fact that $\hbox{tr}(S)$ is the sum of squares of the eigenvalues of $\sqrt{S}$ paved the way. – Omran Kouba Apr 15 '14 at 20:28
  • Old question but can someone correct my confusion: does that inequality really follow from Cauchy-Schwarz? If we take $Z=I$ (so that $U=A=V=I$), then the inequality stated above implies that $Tr(R)\leq\|R\|_2$ for any orthogonal matrix $R$. In particular, if $R=I$, then $Tr(I)=n$ but $\|I\|_2=1$. I think those should be Frobenius norms, not two-norms, right? Since the Frobenius norm is induced by the Frobenius inner product? – David M. Apr 11 '19 at 01:07
  • @DavidM. $\Vert I\Vert_2=n$. The norm associated with the usual Euclidean scalar product on $n\times n$ matrices is the Frobenius norm. – Omran Kouba Apr 17 '19 at 19:13
  • @OmranKouba $\|I\|_2\neq{n}$, right? The two norm is the largest singular value, correct? – David M. Apr 17 '19 at 19:43
  • @DavidM. I think my 2-norm is not the one you use, because yours is not the norm associated with my scalar product $\langle A, B\rangle=tr(AB^T)$ I am talking about the norm associated with this scalar product and I am denoting this norm $\Vert \cdot\Vert_2$. Now, What you are talking about is the operator norm that I would denote just $\Vert\cdot\Vert$. – Omran Kouba Apr 18 '19 at 06:59
  • @OmranKouba Thanks for your response! That would explain it then. So your notation $\|\cdot\|_2$ indicates the Frobenius norm? – David M. Apr 18 '19 at 12:07
0

I'll show how to prove the more general case with complex matrices: find the maximum of $\operatorname{Tr}(UZ)$ over all unitaries $U$: $$\max_{U: U^\dagger U=I}|\operatorname{Tr}(UZ)|.$$

Leveraging the polar decomposition, we know that $Z$ can be always written as $Z=VP$ for some unitary $V$ and positive semi-definite $P\ge0$.

Because the product of unitaries is another unitary, this observation reduces the problem to the following: $$\max_{U: U^\dagger U=I}\operatorname{Tr}(UP).$$ Moreover, observe that the $P$ in the polar decomposition equals $\sqrt{A^\dagger A}$. Denoting with $\{u_k\}$ the eigenvectors of $P$, and $s_k$ the eigenvalues of $P$ (i.e. the singular values of $Z$), we have $P=\sum_k s_k u_k u_k^*,$ and therefore for every unitary $U$ there is an orthonormal basis $\{v_k\}$ such that $$UP=\sum_k s_k v_k u_k^*.$$ It follows that the trace reads $\operatorname{Tr}(UP)=\sum_k s_k \langle u_k,v_k\rangle$, and taking the absolute value,

$$ \lvert\operatorname{Tr}(UP)\rvert=\left\lvert\sum_k s_k = |\langle u_k,v_k\rangle|\right\rvert \le \sum_k s_k \lvert\langle u_k,v_k\rangle\rvert \le\sum_k s_k = \operatorname{Tr}P. \tag X $$ therefore if there is a $U$ such that $\lvert\operatorname{Tr}(UP)\rvert=\sum_k s_k$, that ought to be maximum we are looking for. But finding this $U$ is trivial at this point: just use $U=V^\dagger$.

glS
  • 6,303
  • 3
  • 28
  • 50