
I have been using the metric shown below for a while in a power forecasting context. We call it WAPE (Weighted Average Percentage Error), but I have never seen any references to it in papers, just some questionable-looking blogs:

$WAPE = 100 \cdot \frac{\sum_i |y_i - \hat{y_i}|}{\sum_i y_i}$

where $y_i$ denotes the observed value and $\hat{y_i}$ the predicted one.
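For concreteness, here is a minimal sketch of the metric in Python/NumPy (the function name `wape` is just for illustration):

```python
import numpy as np

def wape(y, y_hat):
    """WAPE as defined above: 100 * sum|y_i - y_hat_i| / sum(y_i)."""
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    return 100.0 * np.abs(y - y_hat).sum() / y.sum()

y = [100.0, 50.0, 25.0]
y_hat = [110.0, 45.0, 30.0]
print(wape(y, y_hat))  # 100 * (10 + 5 + 5) / 175 ≈ 11.43
```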

Is there a standard name for it? Is there any reliable publication or reference for it?

EDIT

I found an entry on Wikipedia for this metric, named Weighted Mean Absolute Percentage Error (wMAPE), though without the factor of 100:

$$ wMAPE = \frac{\sum_i |y_i - \hat{y_i}|}{\sum_i y_i} $$

r3v1
  • Besides a weighted error you can also view this as 'the average error divided by the average value'. Thus computing some kind of average relative error by *first* averaging and then taking the ratio. – Sextus Empiricus Oct 09 '21 at 12:51

2 Answers


Some insight can be achieved by writing the WAPE explicitly as a weighted average.

Recall that with any data $x_i$ and corresponding weights $\omega_i$ (not summing to zero) the weighted average of the data is

$$\operatorname{WA}(\mathbf x, \mathbf \omega) = \frac{1}{\sum_i \omega_i}\,\sum_{i}\omega_i x_i.$$

(When all the $\omega_i$ are equal and nonzero, this equals the usual arithmetic mean.)

Comparing to the WAPE suggests we let $$x_i = 100\, \frac{y_i - \hat y_i}{y_i}$$ be the relative residuals (in percent) of the data $y_i$ with respect to their predicted values $\hat y_i$ and let the weights be $$\omega_i = y_i.$$ Assuming all the $y_i$ are positive, so that all these ratios are defined and $|y_i|=y_i,$ we may rewrite the WAPE as

$$\begin{aligned} \operatorname{WAPE}(\mathbf x, \mathbf \omega) &= 100 \frac{\sum_i|y_i-\hat y_i|}{\sum_i y_i} = \frac{1}{\sum_i y_i}\,\sum_i \left| 100\, y_i \frac{y_i - \hat y_i}{y_i} \right| = \frac{1}{\sum_i \omega_i}\, \sum_i \omega_i \left|x_i\right|\\ &=\operatorname{WA}(|\mathbf x|, \mathbf \omega). \end{aligned}$$
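This identity can be checked numerically; a small sketch (assuming positive $y_i$, with made-up values for illustration):

```python
import numpy as np

y = np.array([120.0, 80.0, 40.0])       # observed values (all positive)
y_hat = np.array([110.0, 90.0, 36.0])   # predicted values

# Direct WAPE
wape_direct = 100.0 * np.abs(y - y_hat).sum() / y.sum()

# Weighted average of |x_i|, with x_i = 100*(y_i - y_hat_i)/y_i and weights w_i = y_i
x = 100.0 * (y - y_hat) / y
w = y
wa = (w * np.abs(x)).sum() / w.sum()

print(wape_direct, wa)  # both equal 10.0
```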

This reveals two things:

  1. Yes, the WAPE is a weighted average of percents: the percents are the absolute residuals $\left|y_i - \hat y_i\right|$ relative to the data $y_i,$ and the weights are the $y_i.$

  2. This is a little strange, because ordinarily one would express residuals relative to the fitted values, which would make the $\hat y_i$ the weights as well as the denominators of the $x_i.$

There are many reasons for (2). For instance, most valid models of non-negative data will guarantee $\hat y_i \ne 0$ even when the observations $y_i$ might be zero or even negative (think of what an additive error term can do to a positive value). As a result, it would be much more difficult to conduct a theoretical analysis of the statistical properties of your version of the WAPE in most situations. I therefore doubt one can find many applications in the literature or even a standard name for it.
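To illustrate point (2): if the residuals are taken relative to the fitted values $\hat y_i$ (with the $\hat y_i$ also serving as weights), the metric remains defined even when some observations are zero. A hypothetical sketch with made-up values:

```python
import numpy as np

y = np.array([0.0, 5.0, 10.0])       # observations may include zero
y_hat = np.array([1.0, 4.0, 11.0])   # fitted values, guaranteed nonzero

# Residuals relative to the observations are undefined when y_i = 0,
# but relative to the fitted values they always exist:
x_fit = 100.0 * (y - y_hat) / y_hat
wape_fit = (y_hat * np.abs(x_fit)).sum() / y_hat.sum()
print(wape_fit)  # simplifies to 100 * sum|y - y_hat| / sum(y_hat) = 18.75
```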

There are a huge number of ways to measure relative errors. See https://stats.stackexchange.com/a/201864/919 for a brief introductory account.

whuber
  • Thank you very much for the detailed explanation. The thing is, I'm forced to use that weird metric in an electric market forecasting context: the forecasts are evaluated using that formula. – r3v1 Oct 09 '21 at 11:46

Perhaps you are not finding much because this statistic is not very interesting in general. First, what happens when the $y_i$ are not all positive? The denominator can become very small (e.g., for normally distributed $y_i$) or even exactly zero (for discrete $y_i$). Even if the $y_i$ are all positive, by the law of large numbers the denominator will be close to $n \mu,$ where $\mu$ is the mean of the $y_i,$ so the metric will be approximately the mean absolute deviation divided by $\mu$.
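A deterministic toy example of the denominator problem (values chosen only for illustration):

```python
import numpy as np

# Mixed-sign observations whose sum is tiny:
y = np.array([1.0, -0.99])           # denominator: sum(y) = 0.01
y_hat = np.array([1.01, -1.0])       # each forecast is off by only 0.01

metric = 100.0 * np.abs(y - y_hat).sum() / y.sum()
print(metric)  # 100 * 0.02 / 0.01 = 200.0, despite near-perfect forecasts
```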

Valentas
  • Your final comment is incorrect: when the $y_i$ vary considerably, the result can be quite different from the mean absolute deviation. – whuber Oct 08 '21 at 13:59