3

The linear discriminant analysis algorithm is as follows:

LDA algorithm

I want to conduct a computational complexity for it. For each step, the complexity is as follows:

  1. For each $c$, there are $N_cd$ additions and $1$ division. Thus, in total, there are $Nd+C$ operations.
  2. $Nd$ additions and $1$. Thus, in total, there are $(Nd+1)$ operations.
  3. There are $N(d+d^2)$, the first $d$ is for $x_j-m_c$, and the second $d^2$ is for ${(x_j-m_c)}{(x_j-m_c)}^T$.
  4. There are $C(d+d^2+d^2)$, the first $d$ is for $m_c-m$, the second $d^2$ is for ${(m_c-m)}{(m_c-m)}^T$, and the third $d^2$ is for the multiplication between $N_c$ and the matrix.
  5. There are $d^3+d^3+d^3$, the first $d^3$ is for ${S_w}^{-1}$ and the second $d^3$ is for the multiplication between ${S_w}^{-1}$ and $S_b$, and the third $d^3$ is for the eigen-decomposition.

In general, we have $N>d>C$, thus the dominated term is $Nd^2$. To sum up, the computational time complexity is $O(Nd^2)$.

My question is: is the analysis right? If not, please help correct it. I have this question because I've read some papers on this subject, but there are different results among them. Any comments would be highly appreciated.

mining
  • 789
  • 5
  • 11
  • 23
  • What's about the complexity of the asymmetric eigen-problem (point 5)? I think that this question is not statistical and would better go to a computer science forum. – ttnphns Mar 20 '15 at 06:44
  • Hi, @ttnphns, thank you for replying! In this case, the matrix is symmetric, thus I just consider the symmetric eigen-problem. In fact, the complexity of an eigen-problem have several complexity that depends on the solution. A description is referred to [http://mathoverflow.net/questions/62904/complexity-of-eigenvalue-decomposition]. Thus I just use the ordinal algorithms, not the fastest speeding algorithms. In addition, because I think the linear discriminant analysis is on statistical, I post it here. – mining Mar 20 '15 at 06:56
  • 1
    Are you sure that it is symmetric? For all that I know it is not ([see](http://stats.stackexchange.com/a/48859/3277) LDA algo in footnote 1). – ttnphns Mar 20 '15 at 07:04
  • @ttnphns, oh, yes! I made a mistake, the $S_w$ may not be symmetric. Thus, the eigen-problem is for a asymmetric matrix. Thank you very much for pointing out this problem and the reference. – mining Mar 20 '15 at 07:14
  • Could anybody help give a hint, please? – mining Mar 21 '15 at 00:35
  • 1
    I'm a little surprised at the close votes--this seems like a pretty reasonable question for Cross Validated, though it'd probably be on topic at CS too.... – Matt Krause Mar 21 '15 at 01:21
  • @MattKrause,thanks for your support! In fact, I've also post this question in the Theoretical Computer Science [http://cstheory.stackexchange.com/questions/30868/computational-complexity-for-linear-discriminant-analysis]. Maybe this problem is too simple or basic. Thus, I've been reading some books on the algorithmic analysis. – mining Mar 21 '15 at 01:48
  • 1
    Just FYI, people generally don't like cross-posting the same question to multiple sites (at least, not right away). If you want to move it to CS, I think a mod can make that happen for you. – Matt Krause Mar 21 '15 at 03:23
  • 1
    For general (i.e. k-class, also called "canonical") LDA algorithm please see https://stats.stackexchange.com/a/48859/3277 and further links there. – ttnphns Jan 11 '18 at 17:12

1 Answers1

3

There are two potential bottlenecks:

  1. As you pointed out, finding the within-class scatter in step 3 takes $N(d+d^2)$ steps, or $O(Nd^2)$.

  2. Both the matrix multiplication and eigendecomposition in step 5 take approximately $O(d^3)$. Depending on the algorithm, matrix multiplication can be a bit faster: there are algorirthms that are $\sim O(n^{2.4})$ and you can also cheat a bit because you don't need to compute every single eigenvector.

The other steps are relatively trivial, so your answer depends on whether $N>d$. If you have more features than examples, it's $O(d^3)$, otherwise it's $O(Nd^2)$

This paper by Cai, He, and Han walks through the space/time complexity in more detail. Annoyingly, they use $m$ to mean the number of examples, which you called $N$, while using $N$ to refer to the number of features, which you call $d$, so beware.

The paper also shows a related algorithm that is more time and space efficient.

Matt Krause
  • 19,089
  • 3
  • 60
  • 101
  • Hi, @MattKrause, thank you very much for affirming this description. In fact, I've read Cai's paper that you referenced. Because they proposed a speeding algorithm for LDA, there are some steps that are different from the steps I posted. They also discussed the cases $m>n$ and $m – mining Mar 21 '15 at 04:28