Suppose $Y_i=g(X_i)+e_i$ with $E(e_i|X_i)=0$, $g(\cdot)$ being an unknown function and $X_i\in S=\{1,2,3,4\}$ with equal probability of taking each value. We want to estimate $g(x)$ using data $\{Y_i,X_i\}_{i=1}^{n}$. Can we estimate $g(x)$ with the Nadaraya–Watson estimator

$\widehat{g}(x)=\frac{\sum_{i=1}^{n}Y_iK_{h}(X_i-x)}{\sum_{i=1}^{n}K_{h}(X_i-x)}$,

where $x\in S$, $K_{h}(u)=\frac{1}{h}k\left(\frac{u}{h}\right)$, and $k(\cdot)$ is some standard second-order kernel?

More specifically, is $\widehat{g}(x)$ still consistent for $g(x)$? Thanks!

My guess is that it is still consistent: as the bandwidth shrinks, eventually all of the weight is placed on observations with $X_i=x$, just as in the case where $X_i$ is continuous.
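
One way to make this guess precise, as a sketch under added assumptions that are not part of the question as posed: the usual scaling $K_h(u)=h^{-1}k(u/h)$, a kernel $k$ that vanishes outside $[-1,1]$ with $k(0)>0$, and $E|Y_i|<\infty$. Points of $S$ are at least 1 apart, so for any $h<1$ and $x\in S$ only observations with $X_i=x$ receive weight:

$$K_h(X_i-x)=\frac{k(0)}{h}\,\mathbf{1}\{X_i=x\}\quad\Longrightarrow\quad\widehat{g}(x)=\frac{\sum_{i=1}^{n}Y_i\mathbf{1}\{X_i=x\}}{\sum_{i=1}^{n}\mathbf{1}\{X_i=x\}}\xrightarrow{p}\frac{E[Y_i\mathbf{1}\{X_i=x\}]}{P(X_i=x)}=g(x).$$

The limit follows from the law of large numbers applied to the numerator and denominator (each divided by $n$), using $E(e_i|X_i)=0$ and $P(X_i=x)=1/4>0$. If instead $h$ is held fixed at a value large enough that neighbouring points receive weight, the same argument gives a limit that is a kernel-weighted average of $g$ over $S$, which generally differs from $g(x)$.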

T34driver
  • This degenerates to the trivial case. There are 4 values $g(1),\dots,g(4)$ to be estimated and the Nadaraya-Watson estimator is simply the conditional sample mean of Y with respect to X. Sample means are trivially consistent. – Michael Aug 16 '20 at 20:14
  • @Michael Right, I guess asymptotically they might be the same. But in finite samples, they could be numerically the same or different depending on the bandwidth and kernel. For example, if $x=2$, the bandwidth is set to 3, and the kernel is positive on $[-1,1]$, then the Nadaraya-Watson estimator will also use observations with $X_i=1,3,4$, while the sample mean only uses observations with $X_i=2$. – T34driver Aug 16 '20 at 21:11
  • If the distribution of X is supported on a finite discrete set, there is no reason to choose the bandwidth so that they overlap. – Michael Aug 16 '20 at 21:59
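
The finite-sample point in the comments above can be checked numerically. The following is a minimal sketch, not anyone's posted code: the Epanechnikov kernel, the function `g(x) = x**2`, the standard normal errors, the sample size, and the helper names `epanechnikov` and `nw_estimate` are all illustrative assumptions. It compares the Nadaraya-Watson estimate at $x=2$ with the sample mean of the $Y_i$ with $X_i=2$, once with a bandwidth below the spacing of $S$ and once with bandwidth 3.

```python
import numpy as np

def epanechnikov(u):
    """Second-order kernel, positive only on (-1, 1) (illustrative choice)."""
    return np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u**2), 0.0)

def nw_estimate(x, X, Y, h, kernel=epanechnikov):
    """Nadaraya-Watson estimate at x, using K_h(u) = kernel(u / h) / h."""
    w = kernel((X - x) / h) / h
    return np.sum(w * Y) / np.sum(w)

rng = np.random.default_rng(0)
n = 2000
X = rng.integers(1, 5, size=n)       # uniform on {1, 2, 3, 4}
g = lambda x: x**2                   # hypothetical regression function
Y = g(X) + rng.normal(size=n)        # errors with E(e | X) = 0

x0 = 2
cell_mean = Y[X == x0].mean()                 # sample mean within the cell X = 2
nw_small = nw_estimate(x0, X, Y, h=0.5)       # h < 1: only X_i = 2 gets weight
nw_large = nw_estimate(x0, X, Y, h=3.0)       # h = 3: X_i = 1, 3, 4 get weight too

print("cell mean:", cell_mean)
print("NW, h=0.5:", nw_small)
print("NW, h=3.0:", nw_large)
```

With the small bandwidth the two estimates coincide up to floating-point rounding, while with bandwidth 3 the estimate is pulled toward a weighted average of $g$ over all four support points.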
