Given a set of features $x_1, x_2, x_3, \dots \in \mathbb{R}$ and an output class variable $y$ taking values in a discrete set of classes,
I could apply Naive Bayes, using the assumption that $x_1, x_2, x_3, \dots$ are conditionally independent given $y$, to predict the class posterior as:
$$P(y \mid x_1, x_2, x_3, \dots) = \frac{P(y)\, P(x_1 \mid y)\, P(x_2 \mid y)\, P(x_3 \mid y) \cdots}{P(x_1)\, P(x_2)\, P(x_3) \cdots}$$
where I could build 1-D histograms from my data to estimate each of the densities on the right-hand side.
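For concreteness, here is a rough sketch (in Python) of what I mean by this first approach. The function names, the bin count, and the assumption that `X` is an array of shape `(n_samples, n_features)` with labels `y` are all just illustrative choices on my part:

```python
import numpy as np

def fit_histogram_nb(X, y, n_bins=20):
    """Fit class priors and per-class, per-feature 1-D histogram densities."""
    classes = np.unique(y)
    model = {"classes": classes, "priors": {}, "hists": {}}
    for c in classes:
        Xc = X[y == c]
        model["priors"][c] = len(Xc) / len(X)
        # One 1-D histogram per feature: this is where the naive
        # (conditional) independence assumption enters.
        model["hists"][c] = [
            np.histogram(Xc[:, j], bins=n_bins,
                         range=(X[:, j].min(), X[:, j].max()), density=True)
            for j in range(X.shape[1])
        ]
    return model

def predict_proba_nb(model, x, eps=1e-12):
    """Posterior over classes for a single sample x."""
    log_scores = []
    for c in model["classes"]:
        log_p = np.log(model["priors"][c])
        for j, (density, edges) in enumerate(model["hists"][c]):
            # Find the bin that x[j] falls into and use its density value.
            k = np.clip(np.searchsorted(edges, x[j]) - 1, 0, len(density) - 1)
            log_p += np.log(density[k] + eps)
        log_scores.append(log_p)
    # The denominator does not depend on the class, so normalising the
    # numerators over the classes gives the posterior.
    log_scores = np.array(log_scores)
    probs = np.exp(log_scores - log_scores.max())
    return dict(zip(model["classes"], probs / probs.sum()))
```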
Or I could instead treat $X = [x_1, x_2, x_3, \dots]$ as an $n$-dimensional variable and use:
$$P(y \mid X) = \frac{P(y)\, P(X \mid y)}{P(X)}$$
where I could use something like multivariate kernel density estimation (KDE) to estimate the right-hand-side densities $P(X \mid y)$ and $P(X)$.
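Again for concreteness, here is a sketch of what I have in mind for this second approach, using `scipy.stats.gaussian_kde` for the joint class-conditional densities; the function names are mine, and the bandwidth is just whatever `gaussian_kde` picks by default:

```python
import numpy as np
from scipy.stats import gaussian_kde

def fit_kde_bayes(X, y):
    """Fit class priors and one joint (multivariate) KDE per class."""
    classes = np.unique(y)
    priors = {c: np.mean(y == c) for c in classes}
    # gaussian_kde expects data of shape (n_features, n_samples),
    # and models the joint density of all features at once.
    kdes = {c: gaussian_kde(X[y == c].T) for c in classes}
    return classes, priors, kdes

def predict_proba_kde(classes, priors, kdes, x):
    """Posterior over classes for a single sample x via Bayes' rule."""
    # Numerators P(y) * P(x | y); P(x) is their sum over the classes.
    num = np.array([priors[c] * kdes[c](x)[0] for c in classes])
    return dict(zip(classes, num / num.sum()))
```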
It looks like I do not need to assume independence of $x_1, x_2, x_3, \dots$ for the second approach. Is that correct, or is the independence assumption somehow absorbed into the KDE process? If not, can the latter still be called Naive Bayes, or would it just be Bayes?
Also, what would be the pros and cons of each approach for computing the posterior probability of the class $y$?