Let's say we have some real-valued features $\mathbf{X}$ and real-valued univariate responses $\mathbf{y}$, and we want to fit a regression model to this data:
$$\mathbf{y} = f\left(\mathbf{X};\beta\right) + \mathbf{\varepsilon}$$
where fitting is done by minimizing some loss function $L(f(x;\beta),y)$ over the training points:
$$\hat{\beta} = \arg\min_{\beta} \sum_i L(f(x_i;\beta),y_i)$$
I noticed that although the total loss is a function of $\mathbf{X}$ and the true responses $\mathbf{y}$, it is not sensitive to the distribution of $\mathbf{y}$ or $\mathbf{X}$: the loss assigned to each point does not depend on how the other points are distributed around it.
I'd like to weight the loss for each training point by a scalar $s_i$ to reflect the "novelty" of each $y_i \in \mathbf{y}$. I was thinking of using a 1-d Voronoi tessellation on $\mathbf{y}$ (with the extreme points serving as outer bounds/convex hull for the tessellation) and setting $s_i$ to be the length of the Voronoi cell associated with $y_i$, such that the new, weighted loss function becomes:
$$L_w(f(x_i;\beta),y_i) = s_iL(f(x_i;\beta),y_i)$$
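For concreteness, here is a minimal sketch of what I have in mind (Python with NumPy and scikit-learn; the helper name `voronoi_weights_1d` and the example data are just illustrative). In 1-d the Voronoi cell boundaries are simply the midpoints between adjacent sorted responses, with the extreme points clipped at $\min(\mathbf{y})$ and $\max(\mathbf{y})$, and the resulting cell lengths are passed as sample weights to a weighted fit:

```python
import numpy as np
from sklearn.linear_model import LinearRegression


def voronoi_weights_1d(y):
    """1-d Voronoi cell lengths for the responses y.

    Cells are bounded by midpoints between adjacent sorted values;
    the two extreme points are clipped at min(y) and max(y), so the
    extremes serve as the outer bounds of the tessellation.
    Note: tied values of y produce zero-length cells.
    """
    y = np.asarray(y, dtype=float)
    order = np.argsort(y)
    y_sorted = y[order]
    # Cell boundaries: outer bounds plus midpoints between neighbours.
    bounds = np.concatenate(([y_sorted[0]],
                             (y_sorted[1:] + y_sorted[:-1]) / 2,
                             [y_sorted[-1]]))
    lengths = np.diff(bounds)      # cell length for each sorted point
    s = np.empty_like(lengths)
    s[order] = lengths             # map back to the original ordering
    return s


# Illustrative usage: weighted least squares via sample_weight.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.3, size=200)

s = voronoi_weights_1d(y)
model = LinearRegression().fit(X, y, sample_weight=s)
print(model.coef_)
```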
I'd appreciate any references on the statistical properties of this approach, whether applied to the response space (as above) or to the feature space (provided a suitable Voronoi tessellation can be constructed for the feature space).