Let $(y_i, x_i, b_i)$, $i = 1, \dots, n$, be the data at hand, where $y_i$ is a response variable, $x_i$ denotes the covariates, and $b_i$ is an observation indicator: $b_i = 1$ if $y_i$ is observed and $b_i = 0$ if it is missing. Under missing at random (MAR), that is, $p(b_i=1 \mid x_i,y_i) = p(b_i=1 \mid x_i)$, the nonparametric local (constant) regression estimator at $x$ is $$ \hat{m} (x) = \arg \min_\theta \sum_i b_i (y_i- \theta)^2 K\left(\frac{x-x_i}{h}\right),$$
where $K(\cdot)$ is a kernel, $h \to 0$, and $nh \to \infty$ as $n \to \infty$.
I wonder why $\hat{m}(x)$ is a consistent estimator of $m(x) = E[y_i \mid x_i = x]$, and how it can become inconsistent when the missing at random assumption fails to hold.
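To make the question concrete, here is a minimal simulation sketch of the estimator. Everything specific in it is an assumption chosen for illustration and not part of the setup above: a Gaussian kernel, a scalar covariate uniform on $[0,1]$, $m(x) = 1 + 2x$, standard normal errors, and the two particular response mechanisms. The function name `local_constant_fit` is hypothetical. The minimizer over $\theta$ has a closed form, namely the kernel-weighted average of the observed $y_i$, which is what the code computes.

```python
import numpy as np

def local_constant_fit(x0, x, y, b, h):
    """Minimizer of sum_i b_i (y_i - theta)^2 K((x0 - x_i)/h) over theta."""
    u = (x0 - x) / h
    w = b * np.exp(-0.5 * u**2)           # Gaussian kernel (assumed); missing rows get weight 0
    y_safe = np.where(b == 1, y, 0.0)     # unobserved y_i never enters the estimate
    return np.sum(w * y_safe) / np.sum(w)

rng = np.random.default_rng(0)
n, h, x0 = 20_000, 0.05, 0.5

x = rng.uniform(0.0, 1.0, size=n)
eps = rng.normal(0.0, 1.0, size=n)
y = 1.0 + 2.0 * x + eps                   # assumed regression function m(x) = 1 + 2x
m_true = 1.0 + 2.0 * x0

# MAR: the response probability depends on x only.
p_mar = 0.3 + 0.5 * x
b_mar = (rng.uniform(size=n) < p_mar).astype(float)

# Not MAR: the response probability depends on y through the error term,
# so larger responses are more likely to be observed.
p_mnar = 1.0 / (1.0 + np.exp(-2.0 * eps))
b_mnar = (rng.uniform(size=n) < p_mnar).astype(float)

est_mar = local_constant_fit(x0, x, np.where(b_mar == 1, y, np.nan), b_mar, h)
est_mnar = local_constant_fit(x0, x, np.where(b_mnar == 1, y, np.nan), b_mnar, h)

print("m(x0)           :", m_true)
print("estimate, MAR   :", est_mar)
print("estimate, no MAR:", est_mnar)
```

In this sketch the estimator targets $E[y_i \mid x_i = x, b_i = 1]$. Under MAR this equals $m(x)$, so the first estimate should land near the true value; in the second design the observed cases over-represent large errors, so the estimate should sit visibly above $m(x)$ no matter how large $n$ is.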