Linear Discriminant Analysis (LDA) and Fisher Linear Discriminant Analysis (FLDA) both project high-dimensional observations onto univariate classification scores, but they do so using different rationales and assumptions. For simplicity, I am considering the two-class case here.
LDA assumes that the observations are normally distributed around the class means with a shared (homoscedastic) covariance. The weight vector that projects the observations onto unidimensional classification scores is derived from the conditional probabilities of the observations under this model. The Wikipedia page on LDA specifies it as:
$$ \vec w = \Sigma^{-1} (\vec \mu_1 - \vec \mu_0) $$
FLDA defines a weight vector that projects the multivariate observations onto univariate classification scores such that the ratio of between-class variance to within-class variance is maximal. The same Wikipedia article specifies it as: $$ \vec w \propto (\Sigma_0+\Sigma_1)^{-1}(\vec \mu_1 - \vec \mu_0) $$
Immediately following the specification of the latter formula (the FLDA weight vector), the Wikipedia article states:
"When the assumptions of LDA are satisfied, the above equation is equivalent to LDA."
However, since $\Sigma=\frac{1}{2}(\Sigma_0+\Sigma_1)$ (with equal class sizes, the pooled covariance is the simple average of the within-class covariances), these two weight vectors always point in the same direction, regardless of whether the assumptions (normality, homoscedasticity) hold.
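A quick numerical sanity check of this claim (a sketch using NumPy; the class means, covariances, and sample sizes are arbitrary choices, deliberately heteroscedastic and with equal class sizes so that the pooled covariance is the plain average):

```python
import numpy as np

rng = np.random.default_rng(0)

# Two classes with deliberately different covariances (heteroscedastic),
# equal sample sizes, so the pooled covariance is the simple average.
n = 500
X0 = rng.multivariate_normal([0, 0], [[1.0, 0.3], [0.3, 2.0]], size=n)
X1 = rng.multivariate_normal([2, 1], [[3.0, -0.5], [-0.5, 0.5]], size=n)

mu0, mu1 = X0.mean(axis=0), X1.mean(axis=0)
S0 = np.cov(X0, rowvar=False)
S1 = np.cov(X1, rowvar=False)
S_pooled = 0.5 * (S0 + S1)  # pooled covariance for equal class sizes

# LDA weight vector: Sigma^{-1} (mu1 - mu0)
w_lda = np.linalg.solve(S_pooled, mu1 - mu0)
# FLDA weight vector: (Sigma0 + Sigma1)^{-1} (mu1 - mu0)
w_flda = np.linalg.solve(S0 + S1, mu1 - mu0)

# Cosine similarity between the two directions
cos = w_lda @ w_flda / (np.linalg.norm(w_lda) * np.linalg.norm(w_flda))
print(cos)  # -> 1.0 (up to floating point), despite heteroscedastic data
```

Since `S_pooled` is exactly half of `S0 + S1`, the two solves differ only by a factor of 2, so the cosine similarity is 1 even though the simulated classes violate homoscedasticity.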
Is the Wikipedia article wrong? Do LDA and FLDA always yield the same solution with respect to the weight vector's direction? Or am I missing some special case?