
I have an original dataset with 70 samples, each with 96 features, labeled 0 or 1. I use linear discriminant analysis (LDA) to reduce the dimensionality of the whole dataset, producing samples with only one feature.

With all 96 features I get 83% accuracy; with the projected samples I get 100% accuracy. I'm using an SVM for classification, with an 80% training / 20% test split.

So my question is: LDA is known as a supervised classification method, but it is often used as a dimensionality reduction technique. Does applying LDA to all the samples amount to pre-training on the data? If so, should I have split the data before using LDA for reduction, and then used the fitted transformation to project the test data?
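A minimal sketch of the two workflows, assuming scikit-learn; `make_classification` is a hypothetical stand-in for the real 70x96 dataset, so the numbers printed are illustrative only:

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Hypothetical stand-in for the 70-sample, 96-feature, binary-labeled dataset
X, y = make_classification(n_samples=70, n_features=96, n_informative=10,
                           random_state=0)

# Leaky: LDA is fit on ALL samples, so the test labels influence the projection
X_lda = LinearDiscriminantAnalysis(n_components=1).fit_transform(X, y)
X_tr, X_te, y_tr, y_te = train_test_split(X_lda, y, test_size=0.2, random_state=0)
leaky_acc = SVC().fit(X_tr, y_tr).score(X_te, y_te)

# Correct: split first, fit LDA on the training fold only, project the test fold
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
lda = LinearDiscriminantAnalysis(n_components=1).fit(X_tr, y_tr)
clean_acc = SVC().fit(lda.transform(X_tr), y_tr).score(lda.transform(X_te), y_te)

print(f"leaky: {leaky_acc:.2f}  correct: {clean_acc:.2f}")
```

With `n = 70 < p = 96`, LDA can separate the training samples perfectly, so fitting it on the full dataset tends to inflate test accuracy; a 100% figure is consistent with that kind of leakage.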

  • That's as with any modeling: if you want to test on a test dataset, you train (model/extract, then classify if you want) the discriminant functions on the training dataset. Then, having their coefficients, you compute the functions' values on the test dataset and perform classification by them there. You can then compare classification accuracy in the train and test sets. – ttnphns Feb 25 '16 at 07:40
  • You say you are `using a svm for classification`. But what classifier are you using in LDA? LDA uses a Gaussian linear classifier (a Bayes classifier). You should not mix the methods: if you are comparing classification done with all the features against classification done with just the discriminants, you should use the same type of classifier in both cases. Another question: I wonder how you managed to run LDA on `n=70 < p=96` singular data? (See the sketch after these comments.) – ttnphns Feb 25 '16 at 07:48
  • I'm not using LDA for classification. I'm using it for dimension reduction only. – Caio Belfort Feb 25 '16 at 21:36
  • In the context of dimensionality reduction using LDA/FDA: is it correct that `LDA/FDA can start with n dimensions and end with k dimensions, where k < n`? Or is the output `c-1, where c is the number of classes and the dimensionality of the data is n with n > c`? – aan May 06 '20 at 21:23
  • @aan please ask a separate question instead of asking by commenting on all the posts... – Matthieu May 07 '20 at 11:09
  • @Matthieu thanks. I added it here: https://stats.stackexchange.com/questions/464734/is-linear-discriminant-analysis-fisher-discriminant-analysis-only-generate-2-o — you can answer there and I can select your answer. – aan May 07 '20 at 15:04
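Regarding the `n=70 < p=96` singularity ttnphns raises above: a hedged sketch of two common workarounds available in scikit-learn (reusing `X`, `y` from the earlier sketch; whether the original poster's implementation used either is unknown):

```python
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# The default 'svd' solver never inverts the within-class covariance matrix,
# so it runs even when that matrix is singular (n < p)
lda_svd = LinearDiscriminantAnalysis(solver="svd", n_components=1).fit(X, y)

# The 'eigen' solver with Ledoit-Wolf shrinkage regularises the covariance
# estimate so that it becomes invertible
lda_shrunk = LinearDiscriminantAnalysis(solver="eigen", shrinkage="auto",
                                        n_components=1).fit(X, y)
```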

1 Answer


LDA used as a dimensionality-reduction technique can be seen as a "supervised PCA": it redistributes your data in a new space (of lower dimension) where the classes should be better separated (based on the labels you provided).

The projection matrix is made of the leading eigenvectors (those with positive eigenvalues) found by LDA; use it to project your test data into that new feature space, then feed the projected samples into your SVM.

Note that you should use a non-linear kernel in your SVM (e.g. RBF); otherwise you'll be stacking one linear transformation on top of another, which will not improve discrimination. SVM and LDA are pretty much equivalent when it comes to linear classification.
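A short sketch of this pipeline, assuming scikit-learn and reusing `X_tr`, `X_te`, `y_tr`, `y_te` from the sketch under the question:

```python
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# LDA projects onto the discriminant axis; the RBF SVM then adds a
# non-linear decision boundary on top of that 1-D projection
clf = make_pipeline(LinearDiscriminantAnalysis(n_components=1),
                    SVC(kernel="rbf"))
clf.fit(X_tr, y_tr)           # the projection is learned from training data only
print(clf.score(X_te, y_te))  # test samples are projected with the fitted LDA
```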

Matthieu
  • em... but LDA classifies using the cluster center of each class, while SVM uses support vectors, which are the boundary samples. – Xiaoxiong Lin Oct 30 '18 at 13:15
  • @XiaoxiongLin for classification, yes. But LDA can also be used to reduce the number of dimensions, by maximizing the distance between cluster centers $B$ while minimizing the intra-cluster covariance $W$ (so to speak, i.e. minimizing the $W/B$ ratio). – Matthieu Oct 30 '18 at 15:18
  • @Matthieu In the context of dimensionality reduction using LDA/FDA: is it correct that `LDA/FDA can start with n dimensions and end with k dimensions, where k < n`? Or is the output `c-1, where c is the number of classes and the dimensionality of the data is n with n > c`? – aan May 06 '20 at 21:24
  • @aan both are correct, because $c < n$. Just mind that the eigenvalues for dimensions $> c-1$ will be $0$ or imaginary, so you won't really get more than $c-1$ dimensions (see the sketch after these comments)... – Matthieu May 07 '20 at 11:11
  • @Matthieu Thanks. Let's say my original dataset has 2 classes: the output will have 1 dimension (2 - 1 = 1); likewise, if my original dataset has 5 classes, the output will have 4 dimensions. – aan May 07 '20 at 15:01
  • @aan that's correct. – Matthieu May 08 '20 at 10:25
  • @Matthieu Thanks. I can choose any output dimensionality I want for LDA, but the problem is that the eigenvalues for dimensions `> c - 1` will be imaginary or zero, which is meaningless. – aan May 08 '20 at 10:55
  • @aan yes, that's the point of the process: have fewer dimensions to make the data easier for people to interpret... – Matthieu May 08 '20 at 19:03
  • @Matthieu Thanks. I am very curious about this. Is it correct that if the `original dataset has 2 classes, the output of LDA/FDA will be 1 dimension (2 - 1 = 1)`? – aan May 08 '20 at 19:13
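A small sketch of the $c-1$ limit discussed in these comments, assuming scikit-learn (synthetic data, illustrative only):

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

for c in (2, 5):
    X, y = make_classification(n_samples=200, n_features=20, n_informative=10,
                               n_classes=c, random_state=0)
    # With no n_components given, LDA keeps min(c - 1, n_features) dimensions
    Z = LinearDiscriminantAnalysis().fit_transform(X, y)
    print(f"{c} classes -> {Z.shape[1]} discriminant dimension(s)")  # 1, then 4
```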