Find the maximum of $\operatorname{Tr}(RZ)$ over all orthogonal matrices $R$

Question

Say I have the following maximization.

$$ \max_{R: R^T R=I_n} \operatorname{Tr}(RZ),$$ where $R$ is an $n\times n$ orthogonal transformational, and the SVD of $Z$ is written as $Z = USV^T$.

I'm trying to find the optimal $R^*$ which intuitively I know is equal to $VU^T$ where $$\operatorname{Tr}(RZ)=\operatorname{Tr}(VU^T USV^T)=\operatorname{Tr}(S).$$ I know this is the max since it is the sum of all the singular values of $Z$. However, I'm having trouble coming up with a mathematical proof justifying my intuition.

Any thoughts?

edit 1: apologies, I meant to write maximization not minimization. — Eagle1992, Apr 15 '14 at 03:54

score 17 · Accepted Answer · edited May 29 '14 at 23:52

17

Let $A=\sqrt{S}$, and equip the space of $n\times n$ real matrices with the usual Euclidean scalar product. Then $$\hbox{Tr}(RZ)= \hbox{Tr}(RUA^2V^T)=\hbox{Tr}((RUA)(VA)^T)=\langle RUA,VA\rangle$$ By the Cauchy-Schwarz inequality, we get $$\hbox{Tr}(RZ)\leq \Vert RUA \Vert_2 \Vert VA \Vert_2= \Vert A \Vert_2 \Vert A \Vert_2 =\hbox{Tr}(AA^T)=\hbox{Tr}(S)$$ where we used the invariance of the $\Vert \cdot \Vert_2 $ under orthogonal transformations. the converse inequality, is proved by choosing $R=VU^T$, and we are done.

edited May 29 '14 at 23:52

Martin Argerami

195,360
15
132
259

answered Apr 15 '14 at 07:20

Omran Kouba

28,320
1
51
96

thank you! the proof is easy to understand. however, I would like to understand your thought process behind it, what led you to define $A= \sqrt{S}$ and continue? – Eagle1992 Apr 15 '14 at 19:48
I wanted to relate your problem to a quadratic optimization problem, and the fact that $\hbox{tr}(S)$ is the sum of squares of the eigenvalues of $\sqrt{S}$ paved the way. – Omran Kouba Apr 15 '14 at 20:28
Old question but can someone correct my confusion: does that inequality really follow from Cauchy-Schwarz? If we take $Z=I$ (so that $U=A=V=I$), then the inequality stated above implies that $Tr(R)\leq\|R\|_2$ for any orthogonal matrix $R$. In particular, if $R=I$, then $Tr(I)=n$ but $\|I\|_2=1$. I think those should be Frobenius norms, not two-norms, right? Since the Frobenius norm is induced by the Frobenius inner product? – David M. Apr 11 '19 at 01:07
@DavidM. $\Vert I\Vert_2=n$. The norm associated with the usual Euclidean scalar product on $n\times n$ matrices is the Frobenius norm. – Omran Kouba Apr 17 '19 at 19:13
@OmranKouba $\|I\|_2\neq{n}$, right? The two norm is the largest singular value, correct? – David M. Apr 17 '19 at 19:43
@DavidM. I think my 2-norm is not the one you use, because yours is not the norm associated with my scalar product $\langle A, B\rangle=tr(AB^T)$ I am talking about the norm associated with this scalar product and I am denoting this norm $\Vert \cdot\Vert_2$. Now, What you are talking about is the operator norm that I would denote just $\Vert\cdot\Vert$. – Omran Kouba Apr 18 '19 at 06:59
@OmranKouba Thanks for your response! That would explain it then. So your notation $\|\cdot\|_2$ indicates the Frobenius norm? – David M. Apr 18 '19 at 12:07

score 0 · Answer 2 · answered Nov 08 '19 at 10:50

I'll show how to prove the more general case with complex matrices: find the maximum of $\operatorname{Tr}(UZ)$ over all unitaries $U$: $$\max_{U: U^\dagger U=I}|\operatorname{Tr}(UZ)|.$$

Leveraging the polar decomposition, we know that $Z$ can be always written as $Z=VP$ for some unitary $V$ and positive semi-definite $P\ge0$.

Because the product of unitaries is another unitary, this observation reduces the problem to the following: $$\max_{U: U^\dagger U=I}\operatorname{Tr}(UP).$$ Moreover, observe that the $P$ in the polar decomposition equals $\sqrt{A^\dagger A}$. Denoting with $\{u_k\}$ the eigenvectors of $P$, and $s_k$ the eigenvalues of $P$ (i.e. the singular values of $Z$), we have $P=\sum_k s_k u_k u_k^*,$ and therefore for every unitary $U$ there is an orthonormal basis $\{v_k\}$ such that $$UP=\sum_k s_k v_k u_k^*.$$ It follows that the trace reads $\operatorname{Tr}(UP)=\sum_k s_k \langle u_k,v_k\rangle$, and taking the absolute value,

$$ \lvert\operatorname{Tr}(UP)\rvert=\left\lvert\sum_k s_k = |\langle u_k,v_k\rangle|\right\rvert \le \sum_k s_k \lvert\langle u_k,v_k\rangle\rvert \le\sum_k s_k = \operatorname{Tr}P. \tag X $$ therefore if there is a $U$ such that $\lvert\operatorname{Tr}(UP)\rvert=\sum_k s_k$, that ought to be maximum we are looking for. But finding this $U$ is trivial at this point: just use $U=V^\dagger$.

Find the maximum of $\operatorname{Tr}(RZ)$ over all orthogonal matrices $R$

2 Answers2

Linked