
Linear systems of equations are pervasive in computational statistics. One special system I have encountered (e.g., in factor analysis) is the system

$$Ax=b$$

where $$A = D + B \Omega B^T.$$ Here $D$ is an $n\times n$ diagonal matrix with a strictly positive diagonal, $\Omega$ is an $m\times m$ (with $m\ll n$) symmetric positive semi-definite matrix, and $B$ is an arbitrary $n\times m$ matrix. In other words, we must solve a diagonal linear system (easy) that has been perturbed by a low-rank matrix. The naive way to solve the problem is to invert $A$ using Woodbury's formula. However, that doesn't feel right, since Cholesky and QR factorizations can usually speed up the solution of linear systems (and normal equations) dramatically. I recently came across the following paper, which seems to take the Cholesky approach and mentions numerical instability of Woodbury's inversion. However, the paper appears to be in draft form, and I could not find numerical experiments or supporting research. What is the state of the art for solving the problem I described?
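
For concreteness, here is a rough NumPy sketch of the Woodbury-based solve I have in mind (the function name is mine, and it assumes $\Omega$ is invertible; the merely positive semi-definite case comes up in the comments below). Only an $m\times m$ system is factored, and no $n\times n$ matrix is ever formed:

```python
import numpy as np

def woodbury_solve(d, B, Omega, b):
    """Solve (diag(d) + B @ Omega @ B.T) x = b via the Woodbury identity.

    d : (n,) strictly positive diagonal of D
    B : (n, m),  Omega : (m, m) assumed invertible here,  b : (n,)
    """
    Dinv_b = b / d                    # D^{-1} b  (elementwise)
    Dinv_B = B / d[:, None]           # D^{-1} B
    # "Capacitance" matrix of the Woodbury formula: Omega^{-1} + B^T D^{-1} B
    S = np.linalg.inv(Omega) + B.T @ Dinv_B
    # x = D^{-1} b - D^{-1} B S^{-1} B^T D^{-1} b
    return Dinv_b - Dinv_B @ np.linalg.solve(S, B.T @ Dinv_b)
```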

gappy
  • @gappy, did you consider using QR (or Cholesky) decomposition for the matrix $\Omega^{-1}+BD^{-1}B^T$ (the middle term in the Woodbury formula)? The remaining operations are simple matrix multiplications. The main source of instability is then the calculation of $\Omega^{-1}$. Since $m< – mpiktas Jun 23 '11 at 13:25
  • I suspect that what Matthias Seeger advocates is within $\epsilon$ of the state of the art, he is a very bright bloke and these sorts of issues crop up repeatedly in the kind of models he investigates. I use Cholesky based methods for the same reasons. I suspect there is discussion in "Matrix Computations" by Golub and Van Loan, which is the standard reference for this sort of thing (although I don't have my copy to hand). – Dikran Marsupial Jun 23 '11 at 15:07
  • Note that by taking $\bar{B} = D^{-1/2} B$ your problem is equivalent to solving the system $(I + \bar{B}\Omega \bar{B}^T)x = \bar{b}$ where $\bar{b} = D^{-1/2} b$. So, that simplifies the problem a bit right there. Now, letting $\Sigma = \bar{B} \Omega \bar{B}^T$, we know that $\Sigma$ is positive semidefinite with at most $m$ positive eigenvalues. Since $m \ll n$, finding the $m$ largest eigenvalues and corresponding eigenvectors can be done in various ways. The solution is then $x = Q(I + \Lambda)^{-1} Q^T \bar{b}$ where $\Sigma = Q \Lambda Q^T$ gives the eigendecomposition of $\Sigma$. – cardinal Jun 23 '11 at 15:38
  • Small corrections: (1) The equivalent system is $(I + \bar{B} \Omega \bar{B}^T) D^{1/2} x = \bar{b}$ and (2) the final solution is $x = D^{-1/2} Q (I + \Lambda)^{-1} Q^T D^{-1/2} b$. (I had dropped a $D^{1/2}$ in front of $x$ in both cases.) Notice that all inverses are of diagonal matrices and so are trivial. – cardinal Jun 23 '11 at 15:51 (a code sketch of this approach appears after the comment thread)
  • @mpiktas: I think you meant $\Omega^{-1} + B^T D^{-1} B$ since in the version you wrote the matrix product is not well-defined due to a dimension mismatch. :) – cardinal Jun 23 '11 at 17:43
  • @mpiktas: Note also that the question statement says only that $\Omega$ is positive semidefinite, not necessarily positive definite (hence invertible). – cardinal Jun 23 '11 at 18:15
  • @cardinal the singular values of $B\Omega B^T$ are the same as those of $\Omega$, so no need to go through $\Sigma$. The bottleneck in the computation of Woodbury's formula is in the computation of $Q R^T$, where $R = Q(I+\Lambda)$ (which is very fast). That step takes $n^2 m$ operations. – gappy Jun 24 '11 at 13:33
  • @mpiktas you are correct, $\Omega$ is psd, but a small modification of @cardinal's approach addresses the problem easily. – gappy Jun 24 '11 at 13:35
  • @gappy: I don't understand your comments (yet). First of all, why would $B \Omega B^T$ have the same singular values as $\Omega$? You said $B$ was *arbitrary*. Also, in your second comment, did you get the references to me and @mpiktas swapped? It makes a little more sense that way, though I'm not quite sure at the moment that there is a "simple" modification of his approach to handle a noninvertible $\Omega$. – cardinal Jun 24 '11 at 14:00
  • Are you asking about speed or numerical stability? (or both?) – shabbychef Jun 29 '11 at 22:50
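
The rescaling idea outlined in cardinal's comments above can be carried out in $O(nm^2)$ work without forming any $n\times n$ matrix. Below is a minimal sketch (not from the thread; the function name is illustrative). Because only the top $m$ eigenpairs are kept, it uses the low-rank form of the inverse, $(I + Q\Lambda Q^T)^{-1} = I - Q\,\Lambda(I+\Lambda)^{-1}Q^T$:

```python
import numpy as np

def solve_diag_plus_lowrank(d, B, Omega, b):
    """Solve (diag(d) + B @ Omega @ B.T) x = b for symmetric PSD Omega,
    without forming any n x n matrix (O(n m^2) work)."""
    sqrt_d = np.sqrt(d)                       # D^{1/2}
    Bbar = B / sqrt_d[:, None]                # D^{-1/2} B
    bbar = b / sqrt_d                         # D^{-1/2} b

    # Symmetric square root of the small m x m PSD matrix Omega
    w, V = np.linalg.eigh(Omega)
    w = np.clip(w, 0.0, None)                 # guard against tiny negative eigenvalues
    C = Bbar @ (V * np.sqrt(w))               # C @ C.T == Bbar @ Omega @ Bbar.T

    # Thin SVD of C gives the top-m eigenpairs of Sigma = C C^T
    Q, s, _ = np.linalg.svd(C, full_matrices=False)
    lam = s ** 2                              # eigenvalues of Sigma

    # With only m eigenpairs: (I + Q diag(lam) Q^T)^{-1} = I - Q diag(lam/(1+lam)) Q^T
    y = bbar - Q @ ((lam / (1.0 + lam)) * (Q.T @ bbar))
    return y / sqrt_d                         # undo the D^{1/2} rescaling of x
```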

1 Answer


"Matrix Computations" by Golub & van Loan has a detailed discussion in chapter 12.5.1 on updating QR and Cholesky factorizations after rank-p updates.

Fabian Pedregosa
  • I know, and the relevant LAPACK functions are mentioned both in the paper I linked to and in the book. I wonder, however, what the best practice is for the problem at hand, not for the generic updating problem. – gappy Jun 24 '11 at 13:26