I've got a graph of RMSE% vs. unit size and it declines nicely. Is this scale-dependence or does the "%" compensate for that?

$$ \text{RMSE\%} = 100\% \cdot \frac{\sqrt{\frac{1}{n}\sum_{i=1}^n (y_i - \hat{y}_i)^2}}{\bar{y}} $$

Richard Hardy
J Kelly

1 Answer

A function $f(\cdot)$ is scale-invariant if it yields the same result for argument $x$ as it does for argument $cx$, where $c$ is some positive constant. Let us see whether supplying $(cy_i,c\hat{y}_i)$ in place of $(y_i,\hat{y}_i)$ for $i=1,\dotsc,n$ will change the value of $\text{RMSE\%}$:

$$ \begin{aligned} \text{RMSE\%}(cy_i,c\hat{y}_i) &= 100\% \cdot \frac{\sqrt{\frac{1}{n}\sum_{i=1}^n (cy_i - c\hat{y}_i)^2}}{c\bar{y}} \\ &= 100\% \cdot \frac{\sqrt{c^2\frac{1}{n}\sum_{i=1}^n (y_i - \hat{y}_i)^2}}{c\bar{y}} \\ &= 100\% \cdot \frac{c\sqrt{\frac{1}{n}\sum_{i=1}^n (y_i - \hat{y}_i)^2}}{c\bar{y}} \\ &= 100\% \cdot \frac{\sqrt{\frac{1}{n}\sum_{i=1}^n (y_i - \hat{y}_i)^2}}{\bar{y}} = \text{RMSE\%}(y_i,\hat{y}_i) \end{aligned} $$

On the way I used the obvious property that $\bar{y}$ gets multiplied by $c$ when $y_i$ is replaced by $cy_i$: if $\bar{y}=\frac{1}{n}\sum_{i=1}^n y_i$, then when scaled by $c$ you get $\frac{1}{n}\sum_{i=1}^n (cy_i)=c\frac{1}{n}\sum_{i=1}^n y_i=c\bar{y}$.

You see that $\text{RMSE\%}$ yields the same result for $(cy_i,c\hat{y}_i)$ as it does for $(y_i,\hat{y}_i)$; hence, it is scale-invariant.
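If you want to convince yourself numerically, here is a minimal sketch of the check. The helper name `rmse_pct`, the data values, and the constant `c` are all made up for illustration; only the formula comes from the question.

```python
import math

def rmse_pct(y, y_hat):
    """RMSE expressed as a percentage of the mean of the observed values."""
    n = len(y)
    rmse = math.sqrt(sum((a - b) ** 2 for a, b in zip(y, y_hat)) / n)
    return 100.0 * rmse / (sum(y) / n)

# Made-up observations and predictions, purely for illustration.
y = [10.0, 12.0, 9.0, 15.0, 11.0]
y_hat = [11.0, 11.5, 10.0, 13.0, 12.0]

c = 1000.0  # an arbitrary positive constant, e.g. a change of units

base = rmse_pct(y, y_hat)
scaled = rmse_pct([c * v for v in y], [c * v for v in y_hat])

# The two values should agree up to floating-point rounding.
print(base, scaled, math.isclose(base, scaled, rel_tol=1e-12))
```

Note that this only covers multiplying both $y_i$ and $\hat{y}_i$ by the same constant; it says nothing about other kinds of rescaling.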

Richard Hardy
  • Thank you, and I understand the very clear work you've shown here. Does this rest on the assumption that the rescaling preserves the variance (i.e., that nothing is going on beyond multiplication by a constant)? The context is spatial aggregation by averaging. Between-unit variance in the x's declines with unit size. – J Kelly Jan 16 '16 at 13:29
  • And n is lower. – J Kelly Jan 16 '16 at 13:33
  • Well, scaling $x$ to $cx$ for some positive constant $c$ yields the following scaling in variance: $\text{Var}(x)=\sigma^2 \rightarrow \text{Var}(cx)=c^2\sigma^2$. Regarding $\bar{y}$: $\bar{y}=\frac{1}{n}\sum_{i=1}^n y_i$, hence when scaled by $c$ you get $\frac{1}{n}\sum_{i=1}^n (cy_i)=c\frac{1}{n}\sum_{i=1}^n y_i=c\bar{y}$. I don't know how all this should be interpreted in the context of spatial aggregation by averaging, though. – Richard Hardy Jan 16 '16 at 13:42
  • Ok, I understand. My context is different. Unit size doubles, x gets rescaled to its 4-unit neighborhood average, y gets rescaled to its 4-unit neighborhood sum. It's just a very different setup. – J Kelly Jan 16 '16 at 13:47
  • In such a setting, the scaling is not so trivial. It looks like some kind of smoothing. Maybe asymptotically it is still fine and RMSE% is roughly "scale invariant" (as per your setting of scaling), but that needs some more thought. – Richard Hardy Jan 16 '16 at 13:50
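To illustrate why aggregation differs from multiplication by a constant, here is a rough simulation sketch. Everything in it is an assumption rather than the asker's actual setup: the data are synthetic, the "4-unit neighborhood" is approximated by summing consecutive groups of four values, the prediction errors are independent across units, and the predictions are simply summed rather than re-fitted from averaged $x$'s. Under those assumptions, the RMSE of the block sums grows by about $\sqrt{4}=2$ while the mean grows by exactly $4$, so RMSE% should roughly halve.

```python
import math
import random

def rmse_pct(y, y_hat):
    """RMSE expressed as a percentage of the mean of the observed values."""
    n = len(y)
    rmse = math.sqrt(sum((a - b) ** 2 for a, b in zip(y, y_hat)) / n)
    return 100.0 * rmse / (sum(y) / n)

def block_sums(values, size=4):
    """Sum consecutive groups of `size` values (a stand-in for a 4-unit neighborhood)."""
    return [sum(values[i:i + size]) for i in range(0, len(values), size)]

rng = random.Random(0)  # fixed seed so the run is reproducible
n = 4096                # number of fine-scale units, divisible by 4

# Synthetic predictions, and observations equal to prediction plus
# independent Gaussian noise (an assumption; real spatial errors are
# often correlated between neighbors).
y_hat = [rng.uniform(50.0, 150.0) for _ in range(n)]
y = [p + rng.gauss(0.0, 10.0) for p in y_hat]

fine = rmse_pct(y, y_hat)
coarse = rmse_pct(block_sums(y), block_sums(y_hat))

print(fine, coarse, coarse / fine)
```

If neighboring errors were strongly positively correlated instead, the block RMSE would grow closer to fourfold and RMSE% would decline much less, so a falling curve of RMSE% against unit size may reflect errors averaging out rather than any scale-dependence of the measure itself.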