
Assume $X$ is a data set represented as a matrix. Each row is an instance, consisting of attribute values $x_1,...,x_n$ and a class label $y$ (the possible class labels are $y_1$ and $y_2$).

Given a set of unlabeled instances, i.e. instances consisting only of values for $x_1,...,x_n$, each instance should be assigned to its most probable class.

To do this, I have fit a Kernel Density Estimator with a Gaussian kernel on the subsets of $X$ labeled $y_1$ and $y_2$, written as $X_{y_1}$ and $X_{y_2}$, and on the whole data set, written as $X_{y}$.

To get the posterior probability that an instance $\overrightarrow{x}$ belongs to class $y_1$, I would compute

$p(y_1|\overrightarrow{x}) = \frac{KDE_{X_{y_1}}.score(\overrightarrow{x})}{KDE_{X_{y}}.score(\overrightarrow{x})}$

where 'score' refers to the scoring method of scikit-learn's KernelDensity estimator (sklearn.neighbors.KernelDensity).
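
For concreteness, here is a minimal, self-contained sketch of this setup on toy data (variable names are illustrative; note that KernelDensity's score/score_samples return log-densities, so the ratio is formed by exponentiating a difference of logs):

```python
import numpy as np
from sklearn.neighbors import KernelDensity

rng = np.random.default_rng(0)
X_y1 = rng.normal(0.0, 1.0, size=(50, 2))   # rows of X labeled y_1
X_y2 = rng.normal(3.0, 1.0, size=(50, 2))   # rows of X labeled y_2
X_all = np.vstack([X_y1, X_y2])             # the pooled data X_y

# One Gaussian KDE per subset, as described above.
kde_y1 = KernelDensity(kernel="gaussian", bandwidth=0.5).fit(X_y1)
kde_all = KernelDensity(kernel="gaussian", bandwidth=0.5).fit(X_all)

x = np.array([[0.0, 0.0]])  # an unlabeled instance

# score_samples returns log-densities, so the ratio above is
# exp(log p(x|y_1) - log p(x)). Nothing forces this value to stay <= 1.
posterior = np.exp(kde_y1.score_samples(x) - kde_all.score_samples(x))
print(posterior)  # prints a value > 1 here, reproducing the problem
```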


However, this approach doesn't seem to be right, since some of the posterior estimates I get from it are $>1$.

Where did I go wrong with this approach? What would be the correct approach to probabilistic classification with KDEs?

Note: If there's an error in the question or additional information is required, please leave a comment and I will try to edit my question.

  • See http://stats.stackexchange.com/questions/82797/how-to-draw-random-samples-from-a-non-parametric-estimated-distribution – Sean Easter Dec 11 '15 at 16:24

1 Answer


I'm not quite sure, but it seems you made a mistake in the computation of the posterior probability.

Adapting your notation, the posterior should be computed as

$p(y_1|\overrightarrow{x}) = \frac{KDE_{X_{y_1}}.score(\overrightarrow{x})}{\sum_{i=1}^{|Y|}KDE_{X_{y_i}}.score(\overrightarrow{x})}$

By Bayes' rule, $p(y_1|\overrightarrow{x}) = \frac{p(\overrightarrow{x}|y_1)\,p(y_1)}{p(\overrightarrow{x})}$: your numerator is the class-conditional density without the prior $p(y_1)$, so its ratio against the pooled-data KDE is not constrained to stay below $1$. The formula above normalizes over the classes instead; it assumes equal class priors, so with imbalanced classes each term should additionally be weighted by $p(y_i)$, e.g. the class frequencies. I found this pdf while looking into the problem; see page 27.
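
As a sketch of that computation (hypothetical helper name; it uses score_samples, which returns log-densities, and includes class priors for the general case, with equal priors recovering the formula above):

```python
import numpy as np
from sklearn.neighbors import KernelDensity

def kde_posteriors(kdes, priors, X_new):
    """Posterior p(y_i | x) for each row of X_new.

    kdes:   one fitted KernelDensity per class
    priors: class priors p(y_i); equal priors match the formula above
    X_new:  array of shape (n_instances, n_attributes)
    """
    # log p(x|y_i) + log p(y_i), shape (n_instances, n_classes)
    log_joint = np.stack(
        [kde.score_samples(X_new) + np.log(p) for kde, p in zip(kdes, priors)],
        axis=1,
    )
    # Normalize in log space for numerical stability, then exponentiate.
    log_joint -= log_joint.max(axis=1, keepdims=True)
    joint = np.exp(log_joint)
    return joint / joint.sum(axis=1, keepdims=True)  # rows sum to 1

# e.g.: kde_posteriors([kde_y1, kde_y2], priors=[0.5, 0.5], X_new=x)
```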
