I have been reading this paper for a few days. There is one section (Section 3.3) that confuses me.
We start by gathering local features from training images of a particular class into a single set. Then for every local feature, a Gaussian kernel is placed in the feature space with its mean at the feature. The probability density function (PDF) of the class is then defined as the normalized sum of all the kernels.
To simplify the discussion, assume we have $N$ features, each of dimensionality 128. I get lost where the author says
> a Gaussian kernel is placed in the feature space with its mean at the feature
This sounds like a kernel method, but it also seems the author wants to use a kernel density estimator (KDE). And what exactly does "with its mean at the feature" mean?
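For reference, here is my current reading of the construction as a code sketch, in case it helps pinpoint where I go wrong. I am assuming an isotropic Gaussian kernel with some fixed bandwidth `sigma` (the paper may use a different covariance), and that "the PDF of the class" is the average of the $N$ kernels:

```python
import numpy as np

def gaussian_kde_pdf(x, features, sigma=1.0):
    """Evaluate the class PDF at point x as a normalized sum of
    isotropic Gaussian kernels, one centered on each training feature.
    `sigma` is an assumed bandwidth; the paper presumably specifies one."""
    N, d = features.shape
    diffs = features - x                       # (N, d): x minus each kernel mean
    sq_dists = np.sum(diffs**2, axis=1)        # squared distance to each mean
    norm = (2 * np.pi * sigma**2) ** (d / 2)   # Gaussian normalizing constant
    kernels = np.exp(-sq_dists / (2 * sigma**2)) / norm
    return kernels.mean()                      # average of N kernels -> valid PDF

# toy example: N = 5 local features of dimensionality 128
rng = np.random.default_rng(0)
feats = rng.normal(size=(5, 128))
p = gaussian_kde_pdf(feats[0], feats, sigma=1.0)
```

Under this reading, "with its mean at the feature" would just mean each of the $N$ Gaussians is centered at one training feature vector, so the density is highest near regions where training features cluster.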
Any suggestions to clarify these confusions are welcome!