Should PCA be (always) done before Naive Bayes classification

Question

According to the Wikipedia page on Naive Bayes:

.. Naive Bayes classifiers are a family of simple "probabilistic classifiers" based on applying Bayes' theorem with strong (naive) independence assumptions between the features.

Since data features may not be independent of each other, should one always perform PCA before applying Naive Bayes? PCA is expected to create components which are not much correlated with each other and hence one can expect more robust results with Naive Bayes.

"PCA is expected to create components which are **not much** correlated with each other..." - are principal components supposed to have any correlation at all? — hafiz031, Mar 01 '22 at 04:02
You should ask this as a separate question. Also post the link of that question here. — rnso, Mar 01 '22 at 07:13
@rnso I have probably found what I was looking for: https://stats.stackexchange.com/q/153928/245577 and https://stats.stackexchange.com/q/110508/245577 show PCs **should** be uncorrelated. — hafiz031, Mar 03 '22 at 07:18

Vickyyy · Accepted Answer · 2018-11-16T05:51:29.533

In general, I don't think doing PCA first will improve the classification results of a Naive Bayes classifier. Naive Bayes assumes the features are conditionally independent given the class, that is, $p(x_i \mid C_k) = p(x_i \mid x_{i+1}, \ldots, x_n, C_k)$. This does not mean that the features have to be (marginally) independent.
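To spell out the distinction, here is a short sketch in the same notation (with $C_k$ the classes and $x_1, \ldots, x_n$ the features). Under the naive assumption the class-conditional likelihood factorizes, which gives the usual Naive Bayes posterior:

$$
p(C_k \mid x_1, \ldots, x_n) \;\propto\; p(C_k) \prod_{i=1}^{n} p(x_i \mid C_k).
$$

The marginal joint distribution of two features is then a mixture over classes,

$$
p(x_i, x_j) = \sum_k p(C_k)\, p(x_i \mid C_k)\, p(x_j \mid C_k),
$$

which in general is not equal to $p(x_i)\, p(x_j)$. So features can be correlated overall while still satisfying the Naive Bayes assumption, and, conversely, making them uncorrelated overall says nothing about whether they are independent within each class.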

Moreover, I don't think PCA improves conditional independence in general. PCA without dimension reduction is just a rotation of the coordinate system, and it does not take into account the discriminative power between classes. In most cases this rotation won't give uncorrelated features within each class, as shown in the following figure. Using PCA for dimension reduction might even make things worse: when the feature with discriminative power has small variance, PCA throws it away.
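Both points can be checked numerically. The following is only a minimal sketch on synthetic data, not part of the original answer: the Gaussian class distributions, the sample size, and the helper names `pca_rotate`, `corr` and `gaussian_nb_accuracy` are all assumptions made for illustration, and the classifier is a hand-rolled Gaussian Naive Bayes with equal priors, scored on the training data.

```python
import numpy as np

rng = np.random.default_rng(0)


def pca_rotate(X):
    """Project centered X onto its principal axes, sorted by decreasing variance."""
    Xc = X - X.mean(axis=0)
    eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
    order = np.argsort(eigvals)[::-1]
    return Xc @ eigvecs[:, order]


def corr(Z):
    """Pearson correlation between the two columns of Z."""
    return np.corrcoef(Z, rowvar=False)[0, 1]


def gaussian_nb_accuracy(F, y):
    """Training accuracy of a minimal Gaussian Naive Bayes (equal priors assumed)."""
    classes = np.unique(y)
    log_lik = []
    for k in classes:
        mu = F[y == k].mean(axis=0)
        var = F[y == k].var(axis=0) + 1e-9
        # Sum of per-feature Gaussian log-densities: the "naive" factorization.
        log_lik.append(-0.5 * (np.log(2 * np.pi * var) + (F - mu) ** 2 / var).sum(axis=1))
    pred = classes[np.argmax(np.vstack(log_lik), axis=0)]
    return (pred == y).mean()


n = 5000
y = np.repeat([0, 1], n)

# Part 1: PCA as a pure rotation.
# Hypothetical data: within each class the two features are strongly correlated
# (+0.8 in class 0, -0.8 in class 1).
X0 = rng.multivariate_normal([-2, 0], [[1, 0.8], [0.8, 1]], size=n)
X1 = rng.multivariate_normal([2, 0], [[1, -0.8], [-0.8, 1]], size=n)
Z = pca_rotate(np.vstack([X0, X1]))

pooled_corr = corr(Z)                                # correlation over all samples
class_corrs = [corr(Z[y == k]) for k in (0, 1)]      # correlation within each class
print("pooled correlation after PCA:   ", round(pooled_corr, 3))
print("per-class correlation after PCA:", np.round(class_corrs, 3))

# Part 2: PCA with dimension reduction.
# Hypothetical data: feature 0 has large variance but no class information,
# feature 1 has small variance but separates the classes.
noise = rng.normal(0, 10, size=2 * n)
signal = rng.normal(0, 0.1, size=2 * n) + np.where(y == 1, 0.5, -0.5)
Z2 = pca_rotate(np.column_stack([noise, signal]))

acc_kept = gaussian_nb_accuracy(Z2[:, :1], y)        # first PC only (what reduction keeps)
acc_dropped = gaussian_nb_accuracy(Z2[:, 1:], y)     # second PC only (what reduction discards)
print("accuracy using the kept component:     ", acc_kept)
print("accuracy using the discarded component:", acc_dropped)
```

In the first part the components are uncorrelated over the pooled data, as PCA guarantees, yet the correlation within each class stays large, so the conditional independence assumption is no better satisfied than before. In the second part the high-variance component that a one-component PCA would keep carries essentially no class information, while the discarded component separates the classes almost perfectly.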

Your last objection re - "feature with small variance being thrown away" - can possibly be overcome by scaling before PCA? — rnso, Sep 23 '20 at 16:19
