
I have a rainfall time series: 10 years of data at daily resolution, covering several climate variables.

I'm going to feed the data into an Artificial Neural Network to predict the rainfall variable (PP).

From what I've been reading, MAPE's formula involves dividing by the actual observed value. But since this is rainfall data, there will be days with little or zero precipitation.

This is bad (dividing by zero = black hole). So how do I go about this? I could replace the zero or near-zero values, but that seems wrong: doing so inflates things and amounts to tampering with the data (unlike missing values, which should be imputed from other data rather than filled in with some arbitrary value).
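To make the failure concrete, here is a minimal sketch (made-up numbers, not my real data) of what MAPE does on a series with dry days:

```python
import numpy as np

# Made-up daily rainfall (mm) including dry days, plus hypothetical predictions.
actual = np.array([0.0, 0.2, 5.1, 0.0, 12.4])
predicted = np.array([0.1, 0.3, 4.8, 0.5, 11.0])

# MAPE = 100 * mean(|actual - predicted| / |actual|)
with np.errstate(divide="ignore"):
    ape = np.abs(actual - predicted) / np.abs(actual)

print(ape)               # inf wherever actual == 0
print(100 * ape.mean())  # inf: MAPE is undefined for this series
```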

My professor is stubborn as a mule. Is there any alternative to MAPE? Or are there any methods to circumvent the issues of MAPE?

EDIT

THERE ARE SMALL AND ZERO VALUES IN THE DATASET... Am I just screwed now?

  • Check out F.X.Diebold's free textbook ["Forecasting in Economics, Business, Finance and Beyond"](http://www.ssc.upenn.edu/~fdiebold/Teaching221/Forecasting.pdf), Chapter 10 "Point forecast evaluation". You will find mean squared error, mean absolute error, predictive $R^2$ and Theil's $U$ statistic. Another measure could be mean absolute scaled error. – Richard Hardy May 19 '17 at 08:36
  • Is there really no other circumvention that can lead me to still use MAPE? I know of these alternatives, I've read up on a few of them. But I wanted an insider's opinion on this matter. Am I just gonna have to give my professor the bird and use another error measurement? – ace_01S May 19 '17 at 08:39
  • Look at Tim's answer and show the references to your professor. Hopefully that will be convincing enough. If not, ask him how he thinks MAPE should be calculated when the true values are zeros. – Richard Hardy May 19 '17 at 08:57
  • See https://en.wikipedia.org/wiki/Mean_absolute_scaled_error – Rob Hyndman May 19 '17 at 09:55
  • Everybody seems to know it, I didn't: MAPE stands for [mean absolute percentage error](https://en.wikipedia.org/wiki/Mean_absolute_percentage_error) – normanius Feb 13 '20 at 23:04
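As Richard Hardy and Rob Hyndman point out in the comments above, the mean absolute scaled error (MASE) avoids dividing by individual observations: it scales the forecast MAE by the in-sample MAE of a one-step naive forecast. A minimal sketch, with made-up numbers:

```python
import numpy as np

def mase(actual, predicted, train):
    """Forecast MAE scaled by the in-sample MAE of a one-step naive forecast.
    Well defined even when some actuals are zero, as long as the training
    series is not constant."""
    mae_forecast = np.mean(np.abs(actual - predicted))
    mae_naive = np.mean(np.abs(np.diff(train)))  # naive forecast: y_hat[t] = y[t-1]
    return mae_forecast / mae_naive

# Illustrative numbers only.
train = np.array([0.0, 1.2, 0.0, 7.5, 0.3, 0.0])  # in-sample rainfall (mm)
actual = np.array([0.0, 3.1, 0.8])                # held-out observations
predicted = np.array([0.4, 2.5, 0.0])             # model forecasts
print(mase(actual, predicted, train))             # < 1: better than in-sample naive
```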

1 Answer


No, MAPE is actually a very poor error measure, as discussed by Stephan Kolassa in Best way to optimize MAPE, Prediction Accuracy - Another Measurement than MAPE, and Minimizing symmetric mean absolute percentage error (SMAPE), and in these slides. You can also check the following paper:

Tofallis, C. (2015). A better measure of relative prediction accuracy for model selection and model estimation. Journal of the Operational Research Society, 66(8), 1352-1362.

It is also discussed by Goodwin and Lawton (1999) in the paper On the asymmetry of the symmetric MAPE:

Despite its widespread use, the MAPE has several disadvantages (Armstrong & Collopy, 1992; Makridakis, 1993). In particular, Makridakis has argued that the MAPE is asymmetric in that ‘equal errors above the actual value result in a greater APE than those below the actual value’. Similarly, Armstrong and Collopy argued that ‘the MAPE ... puts a heavier penalty on forecasts that exceed the actual than those that are less than the actual. For example, the MAPE is bounded on the low side by an error of 100%, but there is no bound on the high side’.

The quoted (Makridakis, 1993) paper gives a nice example of the asymmetry: when the actual value is $150$ and the forecast is $100$, the APE is $|\tfrac{150-100}{150}| = 33.33\%$, while when the actual value is $100$ and the forecast is $150$, the APE is $|\tfrac{100-150}{100}| = 50\%$, despite the fact that both forecasts are wrong by $50$ units!
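A two-line check of that arithmetic:

```python
def ape(actual, forecast):
    """Absolute percentage error for a single observation."""
    return abs(actual - forecast) / abs(actual)

print(ape(150, 100))  # 0.333...: the forecast is 50 units below the actual
print(ape(100, 150))  # 0.5     : the forecast is 50 units above the actual
```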

What the above references, and a number of other sources, show is that using MAPE as a criterion for selecting forecasts leads to biased forecasts that systematically underestimate. Moreover, you run into problems whenever the actual value is equal to zero.

In the How to interpret error measures in Weka output? thread you can find a brief review of other error measures.
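For a series with zeros, like the rainfall data in the question, scale-dependent measures remain well defined. A minimal sketch with illustrative numbers:

```python
import numpy as np

# Made-up rainfall data with dry days; zeros cause no trouble here.
actual = np.array([0.0, 0.2, 5.1, 0.0, 12.4])
predicted = np.array([0.1, 0.3, 4.8, 0.5, 11.0])

errors = actual - predicted
print("MAE :", np.mean(np.abs(errors)))      # mean absolute error, in mm
print("RMSE:", np.sqrt(np.mean(errors**2)))  # root mean squared error, in mm
```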

Tim
  • Is there a close enough catch-all measurement statistic that can be used for pretty much any scenario? By extension, he's pretty adamant on anything that has "percentage" in it, saying that it sinks in easier to people if we're given a percentage (and I agree with this). Thoughts? – ace_01S May 19 '17 at 09:10
  • @Ace_01S no, there is not. The choice of error measure is always problem-specific. – Tim May 19 '17 at 09:25
  • @Tim, can you elaborate more, please? Yes, both forecasts in your example are wrong by 50 units, by being wrong by 50 units is obviously a bigger error for a target of 100 vs. 150. Isn't it? – towi_parallelism Aug 06 '19 at 11:42
  • @towi_parallelism "bigger" in what sense? Both are wrong by the same quantity. – Tim Aug 06 '19 at 13:16
  • @Tim, yes, but being wrong by 50 when your actual value is 100 is a bigger error than being wrong by 50 when your actual value is 1 million. Isn't it? That's what is happening in the example too (actual values in your examples are different, hence the MAPE) – towi_parallelism Aug 06 '19 at 13:25
  • @towi_parallelism the point is that you get a different penalty for under-estimates than for over-estimates; in some cases this may be desirable, in other cases not. Usually you want the measure to be symmetric, i.e. to punish over- and under-estimates the same, so as not to have a biased model. – Tim Aug 06 '19 at 13:52
  • I understand what you are trying to say, but all I'm saying is that the example in that famous paper is simply wrong! I have an actual value of `100`. My under-estimated value is `50` and over-estimated value is `150`, both wrong by `50` units from the same actual value. MAPE is gonna be `|(50-100)/100|` = `|(150-100)/100|` = `50%`, totally symmetric! – towi_parallelism Aug 06 '19 at 14:05
  • @towi_parallelism in your example yes, but in the example they gave they show other kind of asymmetry. – Tim Aug 06 '19 at 14:38
  • Unfortunately, I still cannot believe that. That should be the difference between scale-dependent and scale-independent metrics. 50 units is always 50 for MAE, but with a scale-independent metric, 50 units off for an actual value of 150 should exactly be 33.33% off, and 50 units off for the actual value of 100 should exactly be 50% off. I hope I can find a good scientific paper as the counter-argument for the (Makridakis, 1993) paper.. – towi_parallelism Aug 06 '19 at 15:15
  • @towi_parallelism counter-argument to what? This is simple arithmetic. – Tim Aug 06 '19 at 15:28
  • It is simple math, but it results in two very different interpretations. You are saying that treating 50 units off from 100 and 50 units off from 150 differently is an example of asymmetry caused by MAPE. I am saying the example is asymmetric in its nature: being 50 off from a baseline of 100 is totally different from being 50 off from a baseline of 150. Maybe we just need a better example. – towi_parallelism Aug 06 '19 at 15:40
  • I agree, 50 units away from a target of 100 is very different from 50 units away from a target of 150. Same as with 50 units away from a target of 100 versus 50 units away from a target of 1,000,000. In my opinion, where this difference matters is when trying to compare the accuracy of different models on different time series, with different scales. This is, indeed, when MAPE doesn't make as much sense (I still wouldn't discount it though). – Vlad Jul 13 '20 at 07:37