
After reading "What is the correct formula for between-class scatter matrix in LDA?", I have been puzzled by the balanced version of $W$. The original within-class scatter matrix is:

\begin{equation} W=\sum_{i=1}^{k}\sum_{j=1}^{N_i}(x_{ij}-\overline{x}_i)(x_{ij}-\overline{x}_i)^T \end{equation}

And the proposed balanced formula found in the answer there is:

\begin{equation} W'= \overline{n} \sum_{i=1}^{k} \frac{1}{N_i} \sum_{j=1}^{N_i}(x_{ij}-\overline{x}_i)(x_{ij}-\overline{x}_i)^T \end{equation}

where $\overline{n}=\frac{1}{k}\sum_i N_i$ is the mean number of points per class. However, here, although the formula given on the slide is the original (unbalanced) version, the numerical example uses the following:

\begin{equation} W''= \sum_{i=1}^{k} \frac{1}{N_i} \sum_{j=1}^{N_i}(x_{ij}-\overline{x}_i)(x_{ij}-\overline{x}_i)^T \end{equation}

So it is no longer clear to me: is the factor $\overline{n}$ in front of the formula necessary?

Renthal
  • Which numerical example? On page 7? What makes you think they are using $W''$ there? – amoeba Aug 02 '16 at 09:58
  • Yes, on page 7. I tried to replicate the results shown in this example with a simple MATLAB script for the sake of learning. However, I was not getting the same numbers (although I double-checked that I use the same dataset and formulas). Therefore I looked at the relationship between my results and theirs, and found out that they use $W''$ instead of $W$ as they write. Also, for the between-class scatter their results are $2\cdot B$ using the formulas of your other answer. However, I tested, and the eigenvectors are always the same with $W'$ or $W''$, so I wonder whether scaling is affecting the results. – Renthal Aug 02 '16 at 14:54
  • I would have to check the computations in this example, but in general if you scale both $B$ and $W$ by the same amount then $W^{-1}B$ will stay the same. – amoeba Aug 02 '16 at 14:58
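
The scaling argument in the comments can be checked numerically. The following is a minimal sketch, not the computation from the slides: the toy data set is made up, all function and variable names are hypothetical, and it assumes two classes in two dimensions so that the single discriminant direction is proportional to $W^{-1}(\overline{x}_1-\overline{x}_2)$ and the $2\times 2$ inverse can be written in closed form. It builds $W$, $W'$ and $W''$ from the formulas in the question and compares the resulting unit directions.

```python
import math


def class_scatter(points):
    """Return the class mean and the 2x2 scatter sum_j (x_j - mean)(x_j - mean)^T."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points)
    syy = sum((p[1] - my) ** 2 for p in points)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points)
    return (mx, my), [[sxx, sxy], [sxy, syy]]


def weighted_sum(mats, weights):
    """Weighted sum of 2x2 matrices given as nested lists."""
    return [[sum(w * m[r][c] for m, w in zip(mats, weights)) for c in range(2)]
            for r in range(2)]


def direction(w, diff):
    """Unit vector along W^{-1} diff, using the closed-form 2x2 inverse."""
    det = w[0][0] * w[1][1] - w[0][1] * w[1][0]
    vx = (w[1][1] * diff[0] - w[0][1] * diff[1]) / det
    vy = (w[0][0] * diff[1] - w[1][0] * diff[0]) / det
    norm = math.hypot(vx, vy)
    return (vx / norm, vy / norm)


# Made-up toy data with unequal class sizes (N_1 = 5, N_2 = 3).
classes = [
    [(1.0, 2.0), (2.0, 3.0), (3.0, 3.0), (4.0, 5.0), (5.0, 5.0)],
    [(6.0, 1.0), (7.0, 3.0), (8.0, 2.0)],
]

stats = [class_scatter(c) for c in classes]
means = [m for m, _ in stats]
scatters = [s for _, s in stats]
sizes = [len(c) for c in classes]
n_bar = sum(sizes) / len(sizes)  # mean number of points per class

w_unbalanced = weighted_sum(scatters, [1.0] * len(sizes))        # W
w_balanced = weighted_sum(scatters, [n_bar / n for n in sizes])  # W'
w_plain = weighted_sum(scatters, [1.0 / n for n in sizes])       # W''

diff = (means[0][0] - means[1][0], means[0][1] - means[1][1])
dir_unbalanced = direction(w_unbalanced, diff)
dir_balanced = direction(w_balanced, diff)
dir_plain = direction(w_plain, diff)

print("direction with W  :", dir_unbalanced)
print("direction with W' :", dir_balanced)
print("direction with W'':", dir_plain)
```

Since $W' = \overline{n}\,W''$, the two inverses differ only by the positive factor $1/\overline{n}$, which the normalisation removes, so the last two directions agree up to floating-point error. The direction obtained with the unbalanced $W$ need not coincide with them when the class sizes differ.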
