
I have two datasets, both are .csv files:

  1. Forecast: the Marketing team's forecast of the inventory levels that would be required in 2019
  2. Inventory: the factory's records of the actual inventory levels recorded in 2019

I've cleaned the files in R, so they are now formatted in exactly the same way: both have the same number of rows and columns, the same column headers, and material IDs sorted alphabetically, so the two files match row for row on Product ID.

An example table showing the format: [table image not reproduced here]
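
Roughly, the alignment step looks like this (the file names and the `MaterialID` column name below are placeholders, not the real ones):

```r
# Read both cleaned files; file and column names are placeholders
forecast  <- read.csv("forecast_2019.csv")
inventory <- read.csv("inventory_2019.csv")

# Sort both by material ID so the files match row for row
forecast  <- forecast[order(forecast$MaterialID), ]
inventory <- inventory[order(inventory$MaterialID), ]

# Sanity checks: same dimensions, same headers, same IDs in the same order
stopifnot(identical(dim(forecast), dim(inventory)),
          identical(names(forecast), names(inventory)),
          identical(forecast$MaterialID, inventory$MaterialID))
```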

I want to compare these two files using statistical tests, and find out if the forecast is significantly different from actual inventory levels.

I am familiar with the Z-test, t-test, and ANOVA, but I've only used them for scientific data. Are they applicable for finding out how significantly a forecast differs from actual inventory levels?

If not, what are some other tests I can use?

I plan to use Minitab or R, and I am also open to using Python. I'm open to other software anyone wants to recommend, but it will take time for me to learn it, so I prefer to stick with the tools I am already familiar with.

  • Hm. Why do you want statistical tests? – Stephan Kolassa Apr 07 '20 at 15:48
  • To assess time-series forecasts, we usually use measures of accuracy rather than test statistics. You can get a quick overview of some options from a master [here](https://pdfs.semanticscholar.org/af71/3d815a7caba8dff7248ecea05a5956b2a487.pdf). – ulfelder Apr 07 '20 at 15:54
  • @ulfelder, while accuracy measures are popular, tests have their role, too. The [Diebold-Mariano test](https://stats.stackexchange.com/tags/diebold-mariano/info) is among the best known, but there are numerous other tests. See e.g. Diebold, ["Forecasting in Economics, Business, Finance and Beyond"](https://www.sas.upenn.edu/~fdiebold/Teaching221/Forecasting.pdf), Chapter 10. – Richard Hardy Apr 07 '20 at 16:21
  • @RichardHardy, but aren't those usually for comparing two forecasts of the same series? From the table posted, it looks like the question is about assessing the accuracy of (single) forecasts across a number of series. – ulfelder Apr 07 '20 at 17:22
  • @ulfelder, you are right, most of them are for forecast comparisons. – Richard Hardy Apr 07 '20 at 17:47
  • Thanks to everyone who answered! I'm actually still just an undergrad student so definitely not well-versed in what methods to use to analyse different types of data, so I really appreciate all the help! @RichardHardy – E S Apr 08 '20 at 11:25
  • Thank you @ulfelder as well, it won't let me tag two users in one comment! :) – E S Apr 08 '20 at 11:27

1 Answer


As the comments point out, it is very uncommon to test whether forecasts are statistically significantly different from the actuals. I have been forecasting for 14 years now, both academically and practically, and I have never seen this.

If you really want to do this, you could calculate errors as $y_t-\hat{y}_t$ and analyze these. I would start by collecting these over time, so you get a time series, and then seeing whether there are any time series dynamics in there. I would expect so: if sales were lower than expected in one period, then inventory piles up, and it may take a while to sell the inventory off, so you might have a high inventory for multiple periods. Or if your business is seasonal, it might make sense to increase inventories in certain periods, which may lead to seasonalities in your forecast errors.
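
A minimal sketch of this first step in R, assuming you have the errors for one SKU collected in time order (the `Level` column name is a placeholder for whatever your value column is called):

```r
# Forecast errors, kept in time order: actual minus forecast
errors <- inventory$Level - forecast$Level

# Look for time series dynamics: the error path over time and its autocorrelation
plot(errors, type = "l", xlab = "Period", ylab = "Error (actual - forecast)")
acf(errors, main = "Autocorrelation of forecast errors")
```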

If you believe your errors have no dynamics, you could apply a standard one-sample t test, or fit a multilevel model that includes the SKU as a grouping factor.
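
Roughly, both options could look like this in R (the `lme4` package and the `err_df` layout are assumptions about how you would arrange per-SKU errors, not something given in your question):

```r
# One-sample t test of the null hypothesis that the mean error is zero
# (only sensible if the errors show no time series dynamics)
t.test(errors, mu = 0)

# Multilevel alternative: errors from several SKUs with SKU as a grouping factor.
# Assumes a data frame err_df with a numeric column 'error' and a factor column 'sku'.
library(lme4)
fit <- lmer(error ~ 1 + (1 | sku), data = err_df)
summary(fit)  # the fixed intercept estimates the overall bias
```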

This page explains common ways of assessing forecast accuracy (Ulfelder linked to a PDF version, but the entire online textbook is very much worth reading). Note that some error measures may incentivize you to bias your forecast, e.g., the MAPE or the MAE. You can then start thinking about whether your forecasts are "good enough". This may be helpful.
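
For reference, the common point-error measures are simple to compute by hand; a rough sketch, assuming numeric vectors `actual` and `forecast` of equal length:

```r
e <- actual - forecast               # forecast errors

mse  <- mean(e^2)                    # mean squared error
mae  <- mean(abs(e))                 # mean absolute error
mape <- 100 * mean(abs(e / actual))  # MAPE; undefined when any actual is zero
```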

Finally, you may want to think about whether your errors are actually meaningful. You are comparing forecasts of required inventory levels against actual inventory levels. Where do safety amounts come in? Did Marketing already include safety amounts in their forecasts, or were their forecasts expectation forecasts, so Operations added safety amounts on top? (Or did both teams add safety amounts because of miscommunications?) How do you account for logistical rounding or batch sizes? Does it actually make sense for Marketing to forecast inventory, and shouldn't they rather forecast sales (which I believe is far more common)?

Stephan Kolassa
  • Wow, 14 years! I'm still an undergrad with a *very* basic understanding of statistics & data analysis. I really appreciate how clearly you explained this to me! This is actually a small part of my final year project report. Since the level of depth expected from me is not immense, a comparison of MSE, MAPE, MAE as accuracy measures seems most appropriate for me. I'm not confident if I can make a multilevel model based on SKU but I will definitely try it if time allows. For now, I will read up on the links you provided & consider the questions you posed in analysis. Thank you! – E S Apr 08 '20 at 11:43
  • @ES, testing whether forecasts are different from actuals only makes sense if one assumes some sort of measurement error. Otherwise a single discrepancy between a forecast and the corresponding actual is enough to refute the null hypothesis of equality. This reminds me of testing for multicollinearity (which is nonsense, as shown by Dave Giles in his blog post ["Can You Actually TEST for Multicollinearity?"](https://davegiles.blogspot.com/2013/06/can-you-actually-test-for.html)). (On a second reading of that, it is not quite the same, but worth knowing anyway.) – Richard Hardy Apr 08 '20 at 11:45
  • Ok, that makes sense. I had the same doubts, but I just wanted to be certain because I'm not well-versed in methods for analysing forecasts; it's something that was introduced to me very recently. Thanks for the link, will check it out. – E S Apr 08 '20 at 12:21
  • @RichardHardy: I wouldn't go as far as that. I'd say it would certainly make sense to test whether a forecast is right *on average*, i.e., that it is unbiased. In a non-time-series setting, we would use a one-sample $t$ test. We can't do that for time series because of the time dynamics. But the question is a valid one. – Stephan Kolassa Apr 08 '20 at 12:32
  • I agree to some extent, but that is a different question from the one OP posed, unless I am misreading the OP. And here is why I said "to some extent": in most cases I can imagine we know very well that we are not exactly right on average. This is a sharp null hypothesis, and even before testing it we know it is wrong. Thus the test result really informs us about our sample size (is it large enough to detect a violation we know is there) rather than about the correctness of the null. But if we look at more than just the decision itself, i.e. effect size, then it can be helpful. – Richard Hardy Apr 08 '20 at 13:26