
I need to calculate matrix inverse and have been using solve function. While it works well on small matrices, solve tends to be very slow on large matrices. I was wondering if there is any other function or combination of functions (through SVD, QR, LU, or other decomposition functions) that can give me faster results.

kjetil b halvorsen
jitendra
  • Can you provide more information? What are the approximate dimensions? Does the matrix have any special structure (symmetry, sparsity, etc.)? What is your quantitative definition of "slow"? And "fast"? – cardinal Aug 29 '11 at 18:19
  • The approximate dimensions are around 2000x2000. The matrix doesn't have any special structure. `solve` definitely does the job, but I want the algorithm to be faster. So I am just wondering if there is a more efficient (time-wise) function for calculating the inverse of such a large matrix. – jitendra Aug 29 '11 at 18:27
  • Have you tried any of the other suggestions on the help page for `solve`? Of course, absent special structure, you can't escape the theoretical complexity bounds on general matrix inversion. – cardinal Aug 29 '11 at 18:38
  • @Cardinal The trick is to probe further concerning the actual application, for as you know, in many cases inverting the matrix is unnecessary (and time-consuming and error-prone). – whuber Aug 29 '11 at 18:44
  • @whuber: This is a very good point. I suppose sometimes I approach these questions a little too directly. – cardinal Aug 29 '11 at 18:54
  • As @whuber suggests, perhaps you can provide more details on the actual problem you are trying to solve? – cardinal Aug 29 '11 at 19:35
  • In this case, I am not working on any particular problem as such. I am just a beginner trying to learn R programming and asked this question out of curiosity. But I guess that, without any specific structure in the matrix, the solution given by `solve` is the optimal one. – jitendra Aug 29 '11 at 23:13
  • I would recommend reading this short note from Douglas Bates ftp://mozilla.c3sl.ufpr.br/CRAN/web/packages/Matrix/vignettes/Comparisons.pdf, it's a nice quick overview of several approaches in R – Matifou May 03 '19 at 04:41

2 Answers


Have you tried what cardinal suggested and explored some of the alternative methods for computing the inverse? Let's consider a specific example:

library(MASS)

k   <- 2000
rho <- .3

S       <- matrix(rep(rho, k*k), nrow=k)
diag(S) <- 1

dat <- mvrnorm(10000, mu=rep(0,k), Sigma=S) ### be patient!

R <- cor(dat)

system.time(RI1 <- solve(R))
system.time(RI2 <- chol2inv(chol(R)))
system.time(RI3 <- qr.solve(R))

all.equal(RI1, RI2)
all.equal(RI1, RI3)

So, this is an example of a $2000 \times 2000$ correlation matrix for which we want the inverse. On my laptop (Core i5, 2.50 GHz), `solve` takes 8-9 seconds, `chol2inv(chol())` takes a bit over 4 seconds, and `qr.solve()` takes 17-18 seconds (multiple runs of the code are suggested to get stable timings).

So the inverse via the Cholesky decomposition is about twice as fast as `solve`. There may of course be even faster ways of doing it; I just explored some of the most obvious ones here. And as already mentioned in the comments, if the matrix has a special structure, then this can probably be exploited for more speed.
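Since single `system.time()` runs can fluctuate (e.g. due to other processes using the CPU), a more robust comparison can be made with the `microbenchmark` package. This is only a sketch, assuming `R` is the correlation matrix constructed above:

```r
library(microbenchmark)

## Repeat each inversion several times and summarize the timings;
## times is kept small because each inversion of a 2000 x 2000
## matrix takes several seconds.
microbenchmark(
  solve    = solve(R),
  chol2inv = chol2inv(chol(R)),
  qr.solve = qr.solve(R),
  times    = 10
)
```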

Wolfgang
  • Thanks a lot for this solution. I now know at least one method that can do it in half the time compared to `solve` :-) – jitendra Aug 31 '11 at 19:37
  • The Cholesky decomposition is a good choice for covariance/correlation matrices, but keep in mind that in general the matrix has to be Hermitian (for real matrices, that means symmetric) and positive definite. It also uses half of the memory required for the LU decomposition. – Raxel Mar 06 '15 at 07:11
  • pd.solve is faster! – Ga13 Mar 24 '20 at 22:13
  • @Ga13 Not necessarily (and not on two systems that I tested this on). This might depend on your CPU and which linear algebra routines you are using. – Wolfgang Mar 25 '20 at 11:26

If you are working with a covariance matrix or any other positive definite matrix, you can use `pd.solve`, which is faster.

Following Wolfgang's example:

library(MASS)
library(mnormt)

k   <- 2000
rho <- .3

S       <- matrix(rep(rho, k*k), nrow=k)
diag(S) <- 1

dat <- mvrnorm(10000, mu=rep(0,k), Sigma=S) ### be patient!

R <- cor(dat)

system.time(RI1 <- solve(R))
system.time(RI2 <- chol2inv(chol(R)))
system.time(RI3 <- qr.solve(R))
system.time(RI4 <- pd.solve(R))

> system.time(RI1 <- solve(R))
   user  system elapsed 
  13.21    0.03   13.76 
> system.time(RI2 <- chol2inv(chol(R)))
   user  system elapsed 
   5.62    0.05    5.80 
> system.time(RI3 <- qr.solve(R))
   user  system elapsed 
  20.42    0.09   21.10 
> system.time(RI4 <- pd.solve(R))
   user  system elapsed 
   5.53    0.00    5.61 
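As a sanity check that `pd.solve()` and `chol2inv(chol())` produce the same inverse, one can compare them on a small positive definite matrix. This is only an illustrative sketch (the test matrix is made up for the comparison):

```r
library(mnormt)

## Build a small positive definite test matrix (illustrative only):
## crossprod gives a symmetric PSD matrix; adding diag makes it PD.
set.seed(42)
A <- crossprod(matrix(rnorm(100), 10, 10)) + diag(10)

inv1 <- pd.solve(A)
inv2 <- chol2inv(chol(A))

## The two results should agree up to numerical precision;
## check.attributes = FALSE ignores any dimname differences.
all.equal(inv1, inv2, check.attributes = FALSE)
```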
Ga13
  • What package is `pd.solve` in? – Noah Mar 24 '20 at 22:21
  • It is from package 'mnormt' – Ga13 Mar 24 '20 at 23:24
  • This might depend on your CPU and which linear algebra routines you are using. On two systems that I tested this on (both using OpenBLAS), `chol2inv(chol(R)))` was slightly faster than `pd.solve()`. – Wolfgang Mar 25 '20 at 11:21
  • Also, it would be better to do this more than once (and average), because single results can be skewed by some other process taking up CPU cycles in the background. Using package `microbenchmark` would be even better. – Wolfgang Mar 25 '20 at 11:25
  • `pd.solve` is just a wrapper for `chol2inv(chol(R))`, so it can't be faster (look at the code). – Carlos Llosa Oct 06 '20 at 18:10