
Say I have the model $Y_{i} = m(X_{i}) + \epsilon_{i}$, where $(Y_{i})$ and $(X_{i})$ are two mutually independent i.i.d. sequences.

How can I show that the Nadaraya-Watson estimator is unbiased for this model, regardless of the bandwidth? And what is the intuition behind that?

I know that $E(\epsilon_{i})=0$ and that by the law of iterated expectations (LIE) we get $E(E(\epsilon_{i}|X_{i}))=E(\epsilon_{i})=0$. Since we have an i.i.d. sequence, it also holds that $E(\epsilon|X)=0$, i.e. for all $i$. But how should I proceed?

Edit: The NW estimator is given as follows:

$ \hat{m}(x)= \frac{\sum K(\frac{X_{i}-x}{h})Y_{i}}{\sum K(\frac{X_{i}-x}{h})}=\sum W_{i}(x)Y_{i}=\sum x(X'X)^{-1}X_{i}Y_{i} $
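A quick numerical sanity check of the claim can be sketched as follows. This is only an illustration under stated assumptions: a Gaussian kernel, standard normal $X_i$, and $Y_i$ drawn independently of $X_i$ with mean `mu`. The function names `nw_estimate` and `mc_mean` and all parameter values are hypothetical and not part of the question.

```python
import numpy as np

def nw_estimate(x0, X, Y, h):
    """Nadaraya-Watson estimate at x0, using a Gaussian kernel (an assumed choice)."""
    # Kernel weights K((X_i - x0) / h); the normalising constant cancels in the ratio.
    w = np.exp(-0.5 * ((X - x0) / h) ** 2)
    return np.sum(w * Y) / np.sum(w)

def mc_mean(x0, h, n=50, reps=2000, mu=1.5, seed=0):
    """Monte Carlo average of the NW estimate at x0 when Y is independent of X."""
    rng = np.random.default_rng(seed)
    estimates = np.empty(reps)
    for r in range(reps):
        X = rng.normal(size=n)        # assumed design distribution
        Y = mu + rng.normal(size=n)   # drawn independently of X, so m(x) = mu
        estimates[r] = nw_estimate(x0, X, Y, h)
    return estimates.mean()

if __name__ == "__main__":
    for h in (0.2, 1.0, 5.0):
        print(h, mc_mean(0.0, h))
```

Under these assumptions the Monte Carlo average should stay close to `mu` for every bandwidth tried; only the spread of the individual estimates changes with `h`.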

kjetil b halvorsen
user34031
  • Which Nadaraya-Watson estimator do you mean? – Xi'an Feb 14 '21 at 20:12
  • I edited the question – user34031 Feb 18 '21 at 18:11
  • Welcome to CV @user34031. A general approach for these problems is to substitute $Y_i = m(x_i) + \epsilon_i$ into the expression that you added with the edit, then use linearity of expectations and the conditioning you showed. To get further in your derivations I think you'll need to explicitly define the $K(\cdot)$ function, or make some assumptions, e.g. restrict to a class of kernels. – Lucas Roberts Feb 20 '21 at 15:44
  • Also, there are other posts here discussing *how* to choose a kernel for NW that you might find informative, e.g. https://stats.stackexchange.com/questions/16753/which-kernel-function-for-watson-nadaraya-classifier?rq=1, but that one doesn't have anything to say about proving unbiasedness. – Lucas Roberts Feb 20 '21 at 15:57
  • I thought $m(x)=\beta_0$, i.e. a constant? If I plug this into $E(\hat m(x))=E(x(X'X)^{-1}X(\beta_0 + \epsilon))$ and use independence of the error term, the second term vanishes, but I do not end up with $m(x)$. – user34031 Feb 20 '21 at 16:08
  • @user34031 Take a look at equation 6.13 here: https://bookdown.org/egarpor/PM-UC3M/npreg-kre.html. Further down they do define it this way, so this might vary depending on whom you ask, which I'm guessing is why there is some confusion. As regards your comment: do you first condition on $X$ when you take the expectation with respect to $\epsilon$? – Lucas Roberts Feb 20 '21 at 17:08
  • Let's say I want to estimate $Y=\beta_0 + X\beta_1 + \epsilon$; what would my NW estimator $\hat m(x)$ look like then? As for your question: $E(\hat m(x))=E(E(x(X'X)^{-1}X(\beta_0 +\epsilon)|X))$, but this depends on whether I plugged in the correct $m(x)$. – user34031 Feb 23 '21 at 15:29
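
The approach suggested in the comments (substitute, condition on the $X_i$, use linearity) can be sketched as a short derivation. It is only a sketch under assumptions that the thread does not state explicitly: $\mu = E(Y_i)$ is finite, the weights are the kernel weights $W_i(x) = K\!\left(\frac{X_i-x}{h}\right) / \sum_j K\!\left(\frac{X_j-x}{h}\right)$, and the denominator is nonzero (true, for instance, for a Gaussian kernel).

```latex
% Assumption: (Y_i) is independent of (X_i), so m(x) = E(Y_i | X_i = x) = \mu for every x.
\begin{aligned}
E\big(\hat m(x) \mid X_1,\dots,X_n\big)
  &= \sum_{i=1}^{n} W_i(x)\, E\big(Y_i \mid X_1,\dots,X_n\big)
     && \text{($W_i(x)$ depends only on the $X_j$)} \\
  &= \sum_{i=1}^{n} W_i(x)\, \mu
     && \text{(independence of $Y_i$ and the $X_j$)} \\
  &= \mu = m(x)
     && \text{(since $\textstyle\sum_{i} W_i(x) = 1$)} \\
E\big(\hat m(x)\big)
  &= E\Big(E\big(\hat m(x) \mid X_1,\dots,X_n\big)\Big) = m(x)
     && \text{(LIE)}
\end{aligned}
```

The bandwidth $h$ only enters through the weights, and the argument uses nothing about them except that they sum to one and do not depend on the $Y_i$. That is the intuition: a weighted average of i.i.d. variables, with weights chosen independently of those variables, has the same mean as each variable.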
