Why is the Nadaraya-Watson estimator unbiased?

Question

Say I have the model $Y_{i} = m(x_{i}) + \epsilon_{i}$, where $Y_{i}$ and $X_{i}$ are two mutually independent i.i.d. sequences.

How can I show that the Nadaraya-Watson estimator is unbiased for this model, regardless of the bandwidth? And what is the intuition behind that?

I know that $E(\epsilon_{i})=0$ and that by the law of iterated expectations (LIE) we get $E(E(\epsilon_{i}|X_{i}))=E(\epsilon_{i})=0$, and since we have an i.i.d. sequence it also holds that $E(\epsilon|X)=0$, i.e. for all $i$. But how should I proceed?

Edit: The NW estimator is given as follows:

$ \hat{m}(x)= \frac{\sum_{i} K\left(\frac{X_{i}-x}{h}\right)Y_{i}}{\sum_{i} K\left(\frac{X_{i}-x}{h}\right)}=\sum_{i} W_{i}(x)Y_{i}=\sum_{i} x(X'X)^{-1}X_{i}Y_{i} $
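To make the kernel-weighted form of the definition concrete, here is a minimal numerical sketch. It assumes a Gaussian kernel for $K$ and NumPy, neither of which is specified in the question, and the function names `gaussian_kernel`, `nw_weights` and `nw_estimate` are hypothetical, chosen only for illustration.

```python
import numpy as np

def gaussian_kernel(u):
    """Standard normal density, used as the kernel K (an assumption)."""
    return np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)

def nw_weights(x, X, h):
    """Weights W_i(x) = K((X_i - x)/h) / sum_j K((X_j - x)/h)."""
    k = gaussian_kernel((X - x) / h)
    return k / k.sum()

def nw_estimate(x, X, Y, h):
    """Nadaraya-Watson estimate m_hat(x) = sum_i W_i(x) * Y_i."""
    return float(np.dot(nw_weights(x, X, h), Y))

# Illustrative data: a constant regression function m(x) = beta0, no noise.
rng = np.random.default_rng(0)
X = rng.normal(size=200)
beta0 = 2.0
Y_const = np.full_like(X, beta0)

for h in (0.1, 0.5, 5.0):
    print(h, nw_estimate(0.3, X, Y_const, h))
```

Because the weights sum to one for any bandwidth, the estimate of a noiseless constant response equals `beta0` up to floating-point rounding, whichever `h` is used.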

Welcome to CV, @user34031. A general approach for these problems is to substitute $Y_i = m(x_i) + \epsilon_i$ into the expression that you added with the edit, then use linearity of expectation and the conditioning you showed. To get further in your derivations I think you'll need to explicitly define the $K(\cdot)$ function, or make some assumptions, e.g. restrict to a class of kernels. — Lucas Roberts, Feb 20 '21 at 15:44
Also, there are other posts here discussing *how* to choose a kernel for NW that you might find informative, e.g. https://stats.stackexchange.com/questions/16753/which-kernel-function-for-watson-nadaraya-classifier?rq=1, but that post has nothing to say about proving unbiasedness. — Lucas Roberts, Feb 20 '21 at 15:57
I thought $m(x)=\beta_0$, i.e. a constant? If I plug this into $E(\hat m(x))=E(x(X'X)^{-1}X(\beta_0 + \epsilon))$ and use independence of the error term the second term vanishes but I do not end up with my estimator $m(x)$. — user34031, Feb 20 '21 at 16:08
@user34031 Take a look at equation 6.13 here: https://bookdown.org/egarpor/PM-UC3M/npreg-kre.html. Further down they do define it this way, so the definition might vary depending on whom you speak with, which I'm guessing is why there is some confusion. As regards your comment, you first condition on $X$ when you take the expectation with respect to $\epsilon$? — Lucas Roberts, Feb 20 '21 at 17:08
Let's say I want to estimate $Y=\beta_0 + X\beta_1 + \epsilon$; what would my NW estimator $\hat m(x)$ look like then? As for your question: $E(\hat m(x))=E(E(x(X'X)^{-1}X(\beta_0 +\epsilon)|X))$, but this depends on whether I plugged in the correct $m(x)$. — user34031, Feb 23 '21 at 15:29
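
One way to carry out the substitution suggested in the comments is sketched here, under assumptions that are not all stated in the question: $E(\epsilon_i \mid X_1,\dots,X_n)=0$, the weights $W_i(x)$ depend only on the $X_i$, and the denominator $\sum_j K(\frac{X_j-x}{h})$ is nonzero, so that $\sum_i W_i(x)=1$. The last line covers the constant case $m(x)=\beta_0$ raised in the comments.

```latex
\begin{align*}
E\left[\hat m(x) \mid X_1,\dots,X_n\right]
  &= \sum_{i=1}^{n} W_i(x)\, E\left[m(X_i) + \epsilon_i \mid X_1,\dots,X_n\right] \\
  &= \sum_{i=1}^{n} W_i(x)\, m(X_i)
     + \sum_{i=1}^{n} W_i(x)\, E\left[\epsilon_i \mid X_1,\dots,X_n\right] \\
  &= \sum_{i=1}^{n} W_i(x)\, m(X_i), \\
\text{and if } m \equiv \beta_0: \quad
E\left[\hat m(x) \mid X_1,\dots,X_n\right]
  &= \beta_0 \sum_{i=1}^{n} W_i(x) = \beta_0 .
\end{align*}
```

Under these assumptions the bandwidth $h$ enters only through the weights, which sum to one for any $h$, so it drops out in the constant case. For a non-constant $m$, the weighted average $\sum_i W_i(x)\, m(X_i)$ need not equal $m(x)$.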
