
Let $\mbox{cond} (M) := \frac{\sigma_1 (M)}{\sigma_n (M)}$ be the condition number of matrix $M$. Is $$\mbox{cond} ([A,B]) \leq \mbox{cond}(A) + \mbox{cond}(B)$$

true? And is this true for $n \times m$ rectangular matrices? Let's consider $3$ different cases:

  1. $n<m$

  2. $n=m$

  3. $n>m$
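Here is a minimal numerical sketch (not part of the original question) comparing $\mbox{cond}([A,B])$ with $\mbox{cond}(A)+\mbox{cond}(B)$ for random matrices in the three cases, where the denominator is taken to be the smallest singular value; the helper name `cond`, the sizes, and the seed are arbitrary choices for illustration:

```python
import numpy as np

# Sketch: compare cond([A, B]) with cond(A) + cond(B) for random A, B.
# cond(M) = sigma_1(M) / sigma_min(M) (largest over smallest singular value).
def cond(M):
    s = np.linalg.svd(M, compute_uv=False)
    return s[0] / s[-1]

rng = np.random.default_rng(0)
for n, m in [(3, 5), (4, 4), (5, 3)]:      # cases 1 (n < m), 2 (n = m), 3 (n > m)
    A = rng.standard_normal((n, m))
    B = rng.standard_normal((n, m))
    C = np.hstack([A, B])                  # the block matrix [A, B]
    print((n, m), cond(C) <= cond(A) + cond(B), round(cond(C), 3))
```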

Rodrigo de Azevedo
2012User

1 Answer


This is true if $n\leq m$, at least when all the matrices involved have full rank equal to $n$. Since the singular values of $[A,B]$ are the square roots of the eigenvalues of $AA^T+BB^T$, we have
$$
\begin{split}
\sigma_n^2([A,B]) &= \lambda_n(AA^T+BB^T) = \min_{\|x\|_2=1} x^T(AA^T+BB^T)x \\
&\geq \min_{\|x\|_2=1} x^T AA^T x = \lambda_n(AA^T) = \sigma_n^2(A).
\end{split}
$$
Similarly,
$$
\begin{split}
\sigma_n^2([A,B]) &= \lambda_n(AA^T+BB^T) = \min_{\|x\|_2=1} x^T(AA^T+BB^T)x \\
&\geq \min_{\|x\|_2=1} x^T BB^T x = \lambda_n(BB^T) = \sigma_n^2(B),
\end{split}
$$
and hence
$$
\sigma_n([A,B]) \geq \max\{\sigma_n(A),\sigma_n(B)\}.
$$
The same variational characterisation (the Courant-Fischer min-max theorem) also implies that
$$
\sigma_1^2([A,B]) \leq \sigma_1^2(A) + \sigma_1^2(B).
$$
Indeed,
$$
\begin{split}
\sigma_1^2([A,B]) &= \lambda_1(AA^T+BB^T) = \max_{\|x\|_2=1} x^T(AA^T+BB^T)x \\
&\leq \max_{\|x\|_2=1} x^T AA^T x + \max_{\|x\|_2=1} x^T BB^T x = \lambda_1(AA^T) + \lambda_1(BB^T) \\
&= \sigma_1^2(A) + \sigma_1^2(B).
\end{split}
$$
Hence
$$
\mathrm{cond}^2([A,B]) \leq \frac{\sigma_1^2(A)+\sigma_1^2(B)}{\max\{\sigma_n^2(A),\sigma_n^2(B)\}} \leq \frac{\sigma_1^2(A)}{\sigma_n^2(A)} + \frac{\sigma_1^2(B)}{\sigma_n^2(B)} = \mathrm{cond}^2(A) + \mathrm{cond}^2(B),
$$
where the middle inequality uses $\max\{\sigma_n^2(A),\sigma_n^2(B)\}\geq\sigma_n^2(A)$ and $\max\{\sigma_n^2(A),\sigma_n^2(B)\}\geq\sigma_n^2(B)$, and therefore
$$
\mathrm{cond}([A,B]) \leq \sqrt{\mathrm{cond}^2(A)+\mathrm{cond}^2(B)} \leq \mathrm{cond}(A) + \mathrm{cond}(B).
$$
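As a sanity check, here is a minimal numerical sketch of these bounds (my own, not part of the original answer), assuming random full-rank $A$ and $B$ with $n\leq m$; the sizes and the seed are arbitrary choices:

```python
import numpy as np

# Verify the bounds above on one random full-rank instance with n <= m.
rng = np.random.default_rng(1)
n, m = 4, 6
A = rng.standard_normal((n, m))
B = rng.standard_normal((n, m))
C = np.hstack([A, B])                                    # [A, B]

sA = np.linalg.svd(A, compute_uv=False)
sB = np.linalg.svd(B, compute_uv=False)
sC = np.linalg.svd(C, compute_uv=False)

tol = 1e-12
assert sC[-1] >= max(sA[-1], sB[-1]) - tol               # sigma_n([A,B]) >= max{sigma_n(A), sigma_n(B)}
assert sC[0]**2 <= sA[0]**2 + sB[0]**2 + tol             # sigma_1^2([A,B]) <= sigma_1^2(A) + sigma_1^2(B)

condA, condB, condC = sA[0]/sA[-1], sB[0]/sB[-1], sC[0]/sC[-1]
assert condC <= np.hypot(condA, condB) + tol             # sqrt(cond^2(A) + cond^2(B))
print(condC, np.hypot(condA, condB), condA + condB)      # each bound dominates the previous one
```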

For $n>m$, it's generally not true (still assuming that $A$, $B$, and $[A,B]$ all have full rank). As an extreme example, take $m=1$: then $\mathrm{cond}(A)=\mathrm{cond}(B)=1$, but $\mathrm{cond}([A,B])$ can be arbitrarily large. Consider, e.g., $A=[1,0]^T$ and $B=[1,0.001]^T$.
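A quick numerical check of this example (my own sketch; the helper `cond` simply forms $\sigma_1/\sigma_{\min}$):

```python
import numpy as np

# The extreme m = 1 example above: cond(A) = cond(B) = 1, cond([A,B]) blows up.
def cond(M):
    s = np.linalg.svd(M, compute_uv=False)
    return s[0] / s[-1]

eps = 0.001
A = np.array([[1.0], [0.0]])           # 2x1, cond(A) = 1
B = np.array([[1.0], [eps]])           # 2x1, cond(B) = 1
C = np.hstack([A, B])                  # 2x2, nearly singular

print(cond(A), cond(B), cond(C))       # 1.0, 1.0, roughly 2/eps = 2000
```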

The rank-deficient case is a bit more difficult if you consider $\mathrm{cond}(A)=\sigma_1(A)/\sigma_r(A)$, where $r$ is the rank of $A$ (this is how the condition number is usually defined when $A$ does not have full rank). The trouble here is not with the upper bound on $\sigma_1([A,B])$ but with the lower bound on the smallest nonzero singular value, which does not hold in general. Again, as an extreme example, take the $A$ and $B$ from the counterexample above and pad them with enough zero columns to reach the shapes of cases 1 and 2: the nonzero singular values of $A$, $B$, and $[A,B]$ are unchanged, so the rank-based condition numbers are unchanged as well, and the statement fails just as it does in case 3.
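A small numerical sketch of that zero-padding construction (my own illustration; the helper `cond_rank` implements the rank-based definition $\sigma_1/\sigma_r$ with a crude numerical-rank cutoff):

```python
import numpy as np

# Pad the case-3 counterexample with zero columns and use the rank-based
# condition number sigma_1 / sigma_r.
def cond_rank(M):
    s = np.linalg.svd(M, compute_uv=False)
    r = int(np.sum(s > 1e-12 * s[0]))          # numerical rank
    return s[0] / s[r - 1]

eps = 0.001
A = np.array([[1.0, 0.0], [0.0, 0.0]])         # [1, 0]^T padded with a zero column, rank 1
B = np.array([[1.0, 0.0], [eps, 0.0]])         # [1, eps]^T padded with a zero column, rank 1
C = np.hstack([A, B])                          # 2x4, rank 2

print(cond_rank(A), cond_rank(B), cond_rank(C))   # 1.0, ~1.0, roughly 2/eps
```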

Algebraic Pavel
  • What did you mean by the big bracket? The sum of them? Is this what you meant: $\min_{\|x\|=1}\left[x^T(AA^T)x+x^T(BB^T)x\right] \geq \sigma_n^2(A)+\sigma_n^2(B) \geq \max\{\sigma_n^2(A),\sigma_n^2(B)\}$, which leaves $\min\{a+b\} \geq \min\{a\}+\min\{b\} \geq \max\{\min\{a\},\min\{b\}\}$? I don't get it!!! – 2012User Nov 13 '13 at 04:39
  • @User1986 I thought it was obvious since you accepted the answer already :D I've edited the answer to make it less "compact" and avoid the confusion. – Algebraic Pavel Nov 13 '13 at 10:39
  • Thank you again :) How do you conclude $\sigma_n([A,B]) \geq \max\{\sigma_n(A),\sigma_n(B)\} \implies \sigma_n([A,B]) \leq \min\{\sigma_n(A),\sigma_n(B)\}$ for the inequality to be true? – 2012User Nov 13 '13 at 11:54
  • You're welcome :) Thanks for noticing, it should of course be max. – Algebraic Pavel Nov 13 '13 at 12:48
  • I guess Variational Characterization of singular values (stemmed from Norm Characterizations ) holds for sum of 2 matrices, we have a block matrix here. how we can prove this? – 2012User Nov 13 '13 at 15:08
  • Edited... I've just realised that the sum of cond's can be replaced by the square root of the sum of their squares, leading to a possible improvement. – Algebraic Pavel Nov 13 '13 at 15:53
  • All right, so the $\sigma_1$ proof is based on $\max\{a+b\} \leq \max\{a\} + \max\{b\}$, some sort of triangle inequality, right? Where $a=x^T(A^TA)x$ and $b=x^T(B^TB)x$. Is that right for sure? – 2012User Nov 13 '13 at 16:25
  • More or less: $\max_{x\in S}(f(x)+g(x))\leq\max_{x\in S}f(x)+\max_{x\in S}g(x)$. – Algebraic Pavel Nov 13 '13 at 16:27
  • Just one more thing: where (in which books) can I find this theorem? – 2012User Nov 14 '13 at 21:40
  • I don't remember if I saw it anywhere. It's just using twice the Courant-Fischer-Weyl min-max theorem. – Algebraic Pavel Nov 14 '13 at 23:34
  • I just read something about it on Wikipedia. What does $S$ refer to? I'm sorry that I ask so many questions. :( – 2012User Nov 15 '13 at 03:42
  • You mean in the comment? It's a set. Related to the question, $S=\{x:\|x\|_2=1\}$, $f(x)=x^TAA^Tx$, and $g(x)=x^TBB^Tx$. – Algebraic Pavel Nov 15 '13 at 10:05
  • Interesting. By the way, I think it should be $A^TA$ and $B^TB$ instead of $AA^T$ and $BB^T$ everywhere. Am I right? – 2012User Nov 15 '13 at 11:20
  • Why? $[A,B][A,B]^T=AA^T+BB^T$. – Algebraic Pavel Nov 15 '13 at 11:43
  • That's it. The definition of the norm says $\|A\|^2=\max\frac{\|Ax\|^2}{\|x\|^2}=\max\frac{x^TA^TAx}{x^Tx}=\lambda_{\max}(A^TA)$. The order in $AA^T$ should be reversed. – 2012User Nov 15 '13 at 12:50
  • $A^TA$ and $AA^T$ have the same *nonzero* eigenvalues (see the quick numerical check sketched after these comments). – Algebraic Pavel Nov 15 '13 at 13:02
  • I still can't see how to use the min-max theorem twice to get the inequality. Sorry, can you explain this part too? I desperately need to know. Thanks again. – 2012User Nov 17 '13 at 21:40
  • What is not clear now? I've used it in the proof above. Is there something else? – Algebraic Pavel Nov 18 '13 at 06:26
  • I mean, how can we conclude $\max_{x\in S}(f(x)+g(x))\leq \max_{x\in S}f(x)+\max_{x\in S}g(x)$ from the min-max theorem? Honestly, I'd never heard of this theorem before. – 2012User Nov 18 '13 at 08:42
  • Where have I attempted to do this? This is a basic fact about $\max/\min$ (or $\inf/\sup$ if you like). I've never said it follows from the min-max theorem of eigenvalues... – Algebraic Pavel Nov 18 '13 at 10:32
  • Ok, the problem is that we know these functions $f$ and $g$ are positive everywhere (right?), and we know that if the two quantities in the inequality have the same sign (+ or -) then the inequality turns into an equality ($<$ to $=$), although it doesn't affect the proof. – 2012User Nov 18 '13 at 11:49
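For completeness, a quick numerical check (my own sketch, not from the thread) of the two facts used in the comments above: $[A,B][A,B]^T=AA^T+BB^T$, and $A^TA$ and $AA^T$ share the same nonzero eigenvalues. The sizes and seed are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((3, 5))
B = rng.standard_normal((3, 5))
C = np.hstack([A, B])                              # [A, B]

# [A,B][A,B]^T equals AA^T + BB^T
assert np.allclose(C @ C.T, A @ A.T + B @ B.T)

# A A^T (3 eigenvalues) and A^T A (5 eigenvalues, 2 of them ~0) agree
# on their nonzero eigenvalues.
ev_small = np.linalg.eigvalsh(A @ A.T)             # ascending order
ev_big = np.linalg.eigvalsh(A.T @ A)
print(ev_small)
print(ev_big[-3:])                                 # the nonzero ones match
```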