Is calculating a moving average a good way to approximate k-nearest neighbor regression?

Question

I am given i.i.d. samples $(x_1, y_1), \dots, (x_n, y_n)$ such that $y_i = f_0(x_i) + \epsilon_i$, $i = 1, \dots, n$, for some function $f_0$.

Suppose I want an estimate $\hat{f}$ of $f_0$ using k-nearest-neighbors regression in the neighborhood of each $x_i$ in my dataset. So for each $x_i$, I must find its $k$ nearest neighbors and average the $y_j$ with $j \in \mathcal{N}_k(x_i)$, where $\mathcal{N}_k(x)$ is the set of indices of the $k$ sample points nearest to $x$:

$$\hat{f}(x_i) = \frac{1}{k}\sum_{j\in\mathcal{N}_k(x_i)} y_j$$
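
For concreteness, here is a minimal brute-force sketch of this estimator in Python/NumPy. The function name `knn_regress` is only illustrative, and the sketch assumes one-dimensional $x$ and that each $x_i$ counts as one of its own $k$ neighbors (it is at distance zero from itself). It builds the full $n \times n$ distance matrix, so it is only practical for small $n$.

```python
import numpy as np

def knn_regress(x, y, k):
    """Brute-force 1-D k-NN regression evaluated at every sample point.

    Assumption: x_i is included among its own k nearest neighbors.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Pairwise distances |x_i - x_j|, shape (n, n).
    dist = np.abs(x[:, None] - x[None, :])
    # Column indices of the k smallest distances in each row
    # (the order inside those k is arbitrary, which is fine for a mean).
    nbrs = np.argpartition(dist, k - 1, axis=1)[:, :k]
    # Average the responses over each neighborhood.
    return y[nbrs].mean(axis=1)

f_hat = knn_regress([0.0, 1.0, 3.0, 10.0], [0.0, 2.0, 4.0, 6.0], k=2)
```

If several points are tied at the $k$-th smallest distance, which of them is picked is arbitrary in this sketch.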

Now if my $x_i$ are all evenly spaced, then I could simply sort them in ascending order and calculate a moving average over the corresponding elements of $y$ with window size $k$. My question is: will this moving average be approximately equivalent to k-nearest-neighbors regression even if $(x_1, \dots, x_n)$ are not evenly spaced? Are there any tests I can do on the distribution $P(x)$ to check the quality of the approximation?
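
To make the comparison concrete, here is a rough sketch of the two estimators side by side. The function names are hypothetical, and the sketch assumes an odd window size $k$, a window centered on each point in rank order, windows truncated at the two ends of the data, and that each point counts as its own nearest neighbor. The brute-force reference is there only for checking small examples.

```python
import numpy as np

def moving_average_by_rank(x, y, k):
    """Centered moving average of y taken in the sorted order of x (k odd).

    Assumption: windows are truncated at both ends, so the first and last
    k // 2 estimates average fewer than k values.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    order = np.argsort(x, kind="stable")
    csum = np.concatenate(([0.0], np.cumsum(y[order])))  # prefix sums of sorted y
    pos = np.arange(len(x))
    lo = np.maximum(pos - k // 2, 0)
    hi = np.minimum(pos + k // 2 + 1, len(x))
    out = np.empty(len(x))
    out[order] = (csum[hi] - csum[lo]) / (hi - lo)  # map back to input order
    return out

def knn_by_brute_force(x, y, k):
    """Reference k-NN average with O(n^2) memory; each point is its own neighbor."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dist = np.abs(x[:, None] - x[None, :])
    nbrs = np.argpartition(dist, k - 1, axis=1)[:, :k]
    return y[nbrs].mean(axis=1)

# Evenly spaced x: compare interior points only (edges are handled differently).
x_even = np.arange(9.0)
y_even = x_even ** 2
diff_even = np.abs(moving_average_by_rank(x_even, y_even, 3)
                   - knn_by_brute_force(x_even, y_even, 3))
gap_even = diff_even[1:-1].max()

# Two well-separated clusters: windows that straddle the gap disagree.
x_gap = np.array([0.0, 1.0, 2.0, 3.0, 100.0, 101.0, 102.0])
y_gap = x_gap.copy()
gap_uneven = np.abs(moving_average_by_rank(x_gap, y_gap, 3)
                    - knn_by_brute_force(x_gap, y_gap, 3)).max()
print(gap_even, gap_uneven)
```

In the second example the rank window at $x = 3$ averages the responses at 2, 3 and 100, whereas its three nearest neighbors are 1, 2 and 3, so the two estimates differ most where the spacing of $x$ changes abruptly.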

*Moving average model* is a fixed notion which is quite different from *moving average* in general – see [this](https://stats.meta.stackexchange.com/questions/4953/confusing-moving-average-tag-split-into-two) and perhaps edit the title. — Richard Hardy, Oct 24 '17 at 06:13
Yes, sorry I left that detail out. $x_i$ will be the middle element/median of the window. Of course, if ($x_1$, ... $x_n$) are evenly spaced, then $x_i$ will also be the mean of the window. — Moss Murderer, Oct 24 '17 at 07:56
Is the end goal to gain some computation speed? Are there a lot of data points? I mean, why do you want to avoid the nearest-neighbor search? — Cagdas Ozgenc, Oct 24 '17 at 07:58
Yes. There are 300,000 observations in my dataset, so my question was partly motivated by the need to speed it up. However, I am curious to know whether I can use a moving average as a general strategy, because the results looked very similar, at least in my one dataset. — Moss Murderer, Oct 24 '17 at 08:04

