
Let's say we have data on the number of clicks per user over a fairly long period. Given enough historical data, we can use a tool such as Facebook Prophet to forecast daily values. That gives the total number of clicks across all users.

Additionally, we have several ways to segment users, such as operating system, country, and acquisition channel (emails, ads, etc.). These segmentations cross each other rather than nest, so the structure is not strictly hierarchical but grouped.
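To make the grouped structure concrete, here is a minimal sketch with made-up numbers and only two attributes (the segment names and click counts are purely illustrative, not real data). The same bottom-level series roll up to the total along two different paths, and neither path is nested inside the other:

```python
from collections import defaultdict

# Hypothetical bottom-level clicks for a single day, keyed by (os, country).
bottom = {
    ("ios", "US"): 120,
    ("ios", "DE"): 40,
    ("android", "US"): 90,
    ("android", "DE"): 70,
}


def aggregate(bottom_level, key_index):
    """Sum bottom-level values, keeping only the attribute at key_index."""
    totals = defaultdict(int)
    for key, value in bottom_level.items():
        totals[key[key_index]] += value
    return dict(totals)


by_os = aggregate(bottom, 0)        # one way to split the total
by_country = aggregate(bottom, 1)   # another, non-nested way to split it
total = sum(bottom.values())

print(by_os, by_country, total)
```

With three attributes and around 200 countries, the number of bottom-level keys grows multiplicatively, which is exactly the scale concern here.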

Having read a few articles on this task, it seems that the three most common approaches are forecasting at the most granular level and summing up (bottom-up), forecasting at the least granular level and splitting down (top-down), or picking an intermediate cutoff level and combining the two (middle-out).

Given that there might be around 200 countries, fitting a model for every possible combination feels like a crazy idea. Using weights/ratios from historical data sounds plausible for a few levels of disaggregation, but I suspect the accuracy would drop sharply as the depth of disaggregation increases.
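For reference, the weights/ratios idea I have in mind looks roughly like the sketch below. It assumes one simple variant, where each segment's weight is its share of all historical clicks; the function names, the segment history, and the total forecast are all hypothetical placeholders (in practice the total forecast would come from a model such as Prophet):

```python
def historical_proportions(history):
    """Return each segment's share of total historical clicks.

    history maps a segment name to a list of its past daily clicks.
    """
    segment_totals = {seg: sum(values) for seg, values in history.items()}
    grand_total = sum(segment_totals.values())
    return {seg: t / grand_total for seg, t in segment_totals.items()}


def disaggregate(total_forecast, proportions):
    """Split a forecast of the total into per-segment forecasts."""
    return {
        seg: [share * y for y in total_forecast]
        for seg, share in proportions.items()
    }


# Hypothetical history per country and a hypothetical 2-day total forecast.
history = {
    "US": [60, 70, 70],
    "DE": [40, 40, 40],
    "FR": [30, 20, 30],
}
total_forecast = [1000.0, 1200.0]

shares = historical_proportions(history)
segment_forecasts = disaggregate(total_forecast, shares)
print(segment_forecasts)
```

The segment forecasts add up to the total by construction, but the shares are static, so any drift in the mix of segments over time is ignored. Applying the same trick across several crossed attributes compounds that error, which is why I expect accuracy to suffer at deeper levels.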

Can you please provide some guidance on this class of problems? There doesn't seem to be much information on grouped time-series forecasting.

My Name
Don Draper
  • See https://otexts.com/fpp3/hierarchical.html – Rob Hyndman May 17 '21 at 08:35
  • @RobHyndman, thank you very much. I didn't even expect to get an answer from the author of the book where I learned about time-series forecasting in the first place. I have reread the chapter on hierarchical and grouped time series, but it still seems that without using heuristics like weights, there's no way apart from having hundreds of models. I know that there are several reconciliation approaches (including some modern ones like MinT), but they only cover the part where we want the predictions to sum up to the total. We still need individual forecasts for each level. – Don Draper May 17 '21 at 10:00
  • Is your most granular level highly intermittent, or is the issue simply the scale of forecasting at the lowest level and adding it up to your 'groupings'? If scale is the issue, are you planning on using Prophet? It is quite expensive to compute. – Tylerr May 21 '21 at 20:27
  • Another angle to approach the problem is something like constrained factor models (https://www.jstor.org/stable/27920189) since you can specify the $H$ matrix to take into account the different groupings of the time series. – David Veitch May 25 '21 at 16:18

0 Answers