I am trying to understand how the coefficients of linear discriminants are calculated in lda().
Consider the following data set.
library(MASS)
# Shared within-class covariance for both simulated groups
S <- matrix(c(2, .5, .5, 1), 2, 2)
set.seed(1)
# 25 observations per class, centred at (0,0) and (3,2)
X <- data.frame(rbind(mvrnorm(25, c(0, 0), S), mvrnorm(25, c(3, 2), S)),
                Class = c(rep("First", 25), rep("Second", 25)))
lda.fit <- lda(Class ~ X1 + X2, data = X)
lda.fit contains the following output:
Call:
lda(Class ~ X1 + X2, data = X)

Prior probabilities of groups:
 First Second 
   0.5    0.5 

Group means:
               X1         X2
First  -0.2205177 -0.1224064
Second  2.7965638  1.8489960

Coefficients of linear discriminants:
         LD1
X1 0.3476010
X2 0.7330707
It seems that the vector of coefficients should be calculated using the formula $${\bf w} \propto {\bf S}_W^{-1}({\bf m}_2 - {\bf m}_1),$$ where ${\bf S}_W$ is the pooled within-class covariance matrix and ${\bf m}_1$, ${\bf m}_2$ are the sample means of the two groups (the formula is from page 189 of Pattern Recognition and Machine Learning by Christopher M. Bishop).
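To apply this here, I estimate the pooled within-class covariance from the two sample covariance matrices ${\bf S}_1$ and ${\bf S}_2$ (my reading of "pooled", with $n_1 = n_2 = 25$) as $${\bf S}_W = \frac{(n_1-1){\bf S}_1 + (n_2-1){\bf S}_2}{n_1 + n_2 - 2},$$ which is what Sh below is meant to compute.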
# Pooled within-class covariance (divisor n1 + n2 - 2 = 48)
Sh <- ((25-1)*cov(X[1:25, 1:2]) + (25-1)*cov(X[26:50, 1:2])) / (50-2)
# Unnormalised discriminant direction S_W^{-1} (m2 - m1)
w <- solve(Sh) %*% (lda.fit$means[2, ] - lda.fit$means[1, ])
w
which is equal to
        [,1]
X1 0.8668882
X2 1.8282180
and this does not coincide with the coefficients reported by lda.fit. However, the two vectors (w and coef(lda.fit)) point in the same direction: w is a scaled version of coef(lda.fit) and vice versa.
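To make the proportionality concrete, here is a quick check (just element-wise division of the two objects defined above):

# If w and coef(lda.fit) differ only by a scale factor,
# the element-wise ratio should be the same constant in both coordinates.
coef(lda.fit) / w

With the numbers printed above, this ratio is about 0.40 in both coordinates.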
Could someone explain how the coefficients of linear discriminants are calculated? How is the scaling factor chosen for coef(lda.fit)?
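In case it helps, one guess I would like to verify (purely an assumption on my part, not something I have found documented) is that the coefficients are normalised so that the pooled within-group variance of the discriminant scores equals 1:

# Guess to test, not a statement about lda()'s internals:
# is the coefficient vector a scaled so that t(a) %*% Sh %*% a = 1?
a <- coef(lda.fit)
t(a) %*% Sh %*% a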
Any help is much appreciated!