Questions tagged [kernel-smoothing]

Kernel smoothing techniques, such as kernel density estimation (KDE) and Nadaraya-Watson kernel regression, estimate functions by local interpolation from data points. Not to be confused with [kernel-trick], for the kernels used e.g. in SVMs.

A kernel in the context of kernel smoothing is a local similarity function $K$, which is typically symmetric and nonnegative and, for density estimation, must integrate to 1. Kernel smoothing uses these functions to interpolate observed data points into a smooth function.

For example, Nadaraya-Watson kernel regression estimates a function $f : \mathcal X \to \mathbb R$ based on observations $\{ (x_i, y_i) \}_{i=1}^n$ by $$ \hat{f}(x) = \frac{\sum_{i=1}^n K(x, x_i) \, y_i}{\sum_{i=1}^n K(x, x_i)} ,$$ i.e. a mean of the observed data points weighted by their similarity to the test point.
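The weighted mean above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the Gaussian kernel, the bandwidth parameter `h`, the function names, and the toy data are all choices made here, not part of the definition.

```python
import math

def gaussian_kernel(x, xi, h):
    """Gaussian similarity between x and xi at bandwidth h (an assumed kernel choice)."""
    u = (x - xi) / h
    return math.exp(-0.5 * u * u) / (h * math.sqrt(2.0 * math.pi))

def nadaraya_watson(x, xs, ys, h):
    """Mean of ys, weighted by the kernel similarity of each x_i to the query point x."""
    weights = [gaussian_kernel(x, xi, h) for xi in xs]
    total = sum(weights)
    if total == 0.0:
        # Every weight underflowed to zero: the query is too far from the data for this h.
        raise ValueError("all kernel weights are zero; try a larger bandwidth")
    return sum(w * y for w, y in zip(weights, ys)) / total

# Toy data (hypothetical): roughly y = x with a little noise.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [0.1, 0.9, 2.1, 2.9, 4.2]
print(nadaraya_watson(2.0, xs, ys, h=0.5))
```

Because the estimate is a ratio, the kernel's normalizing constant cancels, so any positive rescaling of $K$ gives the same regression fit.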

Kernel density estimation estimates a density function $\hat{p}$ from samples $\{ x_i \}_{i=1}^n$ by $$ \hat{p}(x) = \frac{1}{n} \sum_{i=1}^n K(x, x_i) ,$$ essentially placing density "bumps" at each observed data point.
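A minimal sketch of this estimator follows, again assuming a Gaussian kernel with bandwidth `h`, so that $K(x, x_i) = \frac{1}{h} \phi\left(\frac{x - x_i}{h}\right)$. The sample values and grid are made up for illustration. The Riemann sum at the end checks numerically that the estimate integrates to approximately 1.

```python
import math

def gaussian_kernel(x, xi, h):
    """Gaussian density bump centred at xi with standard deviation h."""
    u = (x - xi) / h
    return math.exp(-0.5 * u * u) / (h * math.sqrt(2.0 * math.pi))

def kde(x, samples, h):
    """Average of the kernel bumps placed at each sample, evaluated at x."""
    return sum(gaussian_kernel(x, xi, h) for xi in samples) / len(samples)

# Hypothetical one-dimensional sample and bandwidth.
samples = [-1.2, -0.4, 0.0, 0.3, 1.5]
h = 0.5

# Riemann-sum check over a grid that comfortably covers the data.
step = 0.01
grid = [-6.0 + step * k for k in range(1201)]
area = sum(kde(x, samples, h) for x in grid) * step
print(area)  # should be close to 1
```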

The choice of kernel function is of theoretical importance but typically does not matter much in practice for estimation quality. (Wikipedia has a table of the most common choices.) Rather, the important practical problem for kernel smoothing methods is that of bandwidth selection: choosing the scale of the kernel function. Undersmoothing or oversmoothing can result in extremely poor estimates, and so care must be taken to choose an appropriate bandwidth, often via cross-validation.
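One way to make the cross-validation idea concrete is leave-one-out likelihood cross-validation: score each candidate bandwidth by how well the density built from the other points predicts each held-out point. The sketch below assumes a Gaussian kernel, simulated standard normal data, and an arbitrary candidate grid. It is one of several possible criteria, not a recommended default.

```python
import math
import random

def gaussian_kernel(x, xi, h):
    """Gaussian density bump centred at xi with standard deviation h."""
    u = (x - xi) / h
    return math.exp(-0.5 * u * u) / (h * math.sqrt(2.0 * math.pi))

def loo_log_likelihood(samples, h):
    """Sum of log densities at each point, estimated from all the other points."""
    n = len(samples)
    total = 0.0
    for i, x in enumerate(samples):
        dens = sum(
            gaussian_kernel(x, xj, h) for j, xj in enumerate(samples) if j != i
        ) / (n - 1)
        # Floor the density so an isolated point under a tiny bandwidth gives a
        # very low score instead of a math domain error from log(0).
        total += math.log(max(dens, 1e-300))
    return total

random.seed(0)
samples = [random.gauss(0.0, 1.0) for _ in range(200)]  # simulated data

candidates = [0.01, 0.05, 0.1, 0.2, 0.4, 0.8, 1.6, 3.2]  # arbitrary grid
scores = {h: loo_log_likelihood(samples, h) for h in candidates}
best_h = max(scores, key=scores.get)
print(best_h)
```

The smallest candidates undersmooth (held-out points fall between narrow bumps and get almost no density), while the largest oversmooth (the estimate is much flatter than the data), so both ends of the grid score poorly.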

Note that the word "kernel" is also used to refer to the kernel of a reproducing kernel Hilbert space, as in the "kernel trick" common in support vector machines and other kernel methods. See [kernel-trick] for this usage.

575 questions

2 answers

What is a "kernel" in plain English?

There are several distinct usages: kernel density estimation, kernel trick, kernel smoothing. Please explain what the "kernel" in them means, in plain English, in your own words.

kernel-trick kernel-smoothing

asked Sep 09 '10 at 00:15

Neil McGuigan

4 answers

Good methods for density plots of non-negative variables in R?

plot(density(rexp(100))) Obviously all density to the left of zero represents bias. I'm looking to summarize some data for non-statisticians, and I want to avoid questions about why non-negative data has density to the left of zero. The plots are…

r density-function gamma-distribution kernel-smoothing

asked Jul 29 '13 at 06:57

generic_user

2 answers

Can you explain Parzen window (kernel) density estimation in layman's terms?

Parzen window density estimation is described as $$ p(x)=\frac{1}{n}\sum_{i=1}^{n} \frac{1}{h^2} \phi \left(\frac{x_i - x}{h} \right) $$ where $n$ is number of elements in the vector, $x$ is a vector, $p(x)$ is a probability density of $x$, $h$ is…

density-function kernel-smoothing intuition density-estimation

asked Nov 03 '16 at 14:30

user366312

1 answer

"Kernel density estimation" is a convolution of what?

I am trying to get a better understanding of kernel density estimation. Using the definition from Wikipedia: https://en.wikipedia.org/wiki/Kernel_density_estimation#Definition $ \hat{f_h}(x) = \frac{1}{n}\sum_{i=1}^n K_h (x - x_i) \quad =…

r kernel-smoothing convolution

asked Oct 23 '13 at 19:36

Tal Galili

2 answers

Choosing a bandwidth for kernel density estimators

For univariate kernel density estimators (KDE), I use Silverman's rule for calculating $h$: \begin{equation} 0.9 \min(sd, IQR/1.34)\times n^{-0.2} \end{equation} What are the standard rules for multivariate KDE (assuming a Normal kernel)?

smoothing kernel-smoothing

asked Jul 19 '10 at 23:26

csgillespie

2 answers

If the Epanechnikov kernel is theoretically optimal when doing Kernel Density Estimation, why isn't it more commonly used?

I have read (for example, here) that the Epanechnikov kernel is optimal, at least in a theoretical sense, when doing kernel density estimation. If this is true, then why does the Gaussian show up so frequently as the default kernel, or in many…

nonparametric kernel-smoothing

asked Jun 01 '16 at 21:30

John Rauser

1 answer

What does the y axis in a kernel density plot mean?

Possible Duplicate: Probability distribution value exceeding 1 is OK? I thought the area under the curve of a density function represents the probability of getting an x value between a range of x values, but then how can the y-axis be greater…

r distributions density-function kernel-smoothing

asked Jan 20 '13 at 01:39

nachocab

2 answers

If variable kernel widths are often good for kernel regression, why are they generally not good for kernel density estimation?

This question is prompted by discussion elsewhere. Variable kernels are often used in local regression. For example, loess is widely used and works well as a regression smoother, and is based on a kernel of variable width that adapts to data…

nonparametric smoothing kernel-smoothing loess

asked Oct 19 '10 at 11:35

Rob Hyndman

4 answers

How to calculate overlap between empirical probability densities?

I'm looking for a method to calculate the area of overlap between two kernel density estimates in R, as a measure of similarity between two samples. To clarify, in the following example, I would need to quantify the area of the purplish overlapping…

r probability density-function kernel-smoothing

asked May 14 '14 at 00:05

mmk

1 answer

Kernel Bandwidth: Scott's vs. Silverman's rules

Could anyone explain in plain English what the difference is between Scott's and Silverman's rules of thumb for bandwidth selection? Specifically, when is one better than the other? Is it related to the underlying distribution? Number of…

kernel-smoothing

asked Mar 20 '14 at 01:41

xrfang

1 answer

What is the long run variance?

How is long run variance in the realm of time series analysis defined? I understand it is utilized in the case there is a correlation structure in the data. So our stochastic process would not be a family of $X_1, X_2 \dots$ i.i.d. random variables…

time-series variance references kernel-smoothing non-independent

asked May 21 '15 at 20:12

Monolite

1 answer

How to draw random samples from a non-parametric estimated distribution?

I have a sample of 100 points which are continuous and one-dimensional. I estimated its non-parametric density using kernel methods. How can I draw random samples from this estimated distribution?

r sampling kernel-smoothing

asked Jan 20 '14 at 07:53

lovekesh

3 answers

Where is density estimation useful?

After going through some slightly terse mathematics, I think I have a slight intuition of kernel density estimation. But I am also aware that estimating multivariate density for more than three variables might not be a good idea, in terms of the…

nonparametric density-function kernel-smoothing bivariate density-estimation

asked Jan 17 '14 at 11:37

lovekesh

2 answers

Area under the "pdf" in kernel density estimation in R

I am trying to use the 'density' function in R to do kernel density estimates. I am having some difficulty interpreting the results and comparing various datasets as it seems the area under the curve is not necessarily 1. For any probability density…

r estimation density-function kernel-smoothing auc

asked Aug 09 '11 at 23:19

highBandWidth

1 answer

Is there an optimal bandwidth for a kernel density estimator of derivatives?

I need to estimate the density function based on a set of observations using the kernel density estimator. Based on the same set of observations, I also need to estimate the first and second derivatives of the density using the derivatives of the…

r nonparametric density-function kernel-smoothing

asked Aug 08 '12 at 12:06

user13154
