I'm looking to forecast (impute) missing discrete values in multiple time series so that they add up to a target volume in a consolidated time series.
The context:
I have salesmen selling contracts in several markets over time.
For some markets, there are gaps with missing data.
Example below: the volume of contracts over time (days) for a given market.
I'm fairly confident I can fill the gaps for this market at the consolidated market level and obtain something like the plot below. (Here I've plotted just one highly correlated market; there are plenty of others.)
The problem:
I'm now looking for a method to estimate, for each salesman, the discrete daily volume of contracts during the gap periods, such that these volumes add up to the market volume I will have estimated.
Another aspect to consider is the number of salesmen on this market: roughly 5000.
As a result, many of them are "opportunistic" salesmen with very sparse, discrete activity.
Here is a tiny sample of the activity by salesman code for the "most productive" ones (yeah...).
In my context, is there a method (or several) that would let me reach this goal?
E.g., impute the most probable discrete activity of each salesman according to:
- an estimated market volume over time, which integrates seasonality and acts as a target (the sum of all salesmen's activity has to equal this target at each time step)
- the known activity of each salesman around the gap ("past activity" before the gap, and "future activity" after it)
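
To make the sum constraint concrete, here is a minimal sketch of one possible formulation. It is an assumption on my part, not a method I have validated. If each salesman's daily count is modelled as an independent Poisson variable, then conditional on the market total the counts follow a multinomial distribution with probabilities proportional to the individual rates. The function name `allocate_counts`, the rates and the target values are all hypothetical; in practice the rates would be estimated from the activity before and after the gap.

```python
import numpy as np

def allocate_counts(rates, market_target, rng):
    """Split integer daily market totals across salesmen.

    rates: (n_salesmen,) non-negative mean daily volumes (hypothetical,
           e.g. estimated from activity around the gap).
    market_target: (n_days,) non-negative integer market volumes.
    Returns an integer array (n_days, n_salesmen); each row sums to the
    corresponding market total.
    """
    rates = np.asarray(rates, dtype=float)
    probs = rates / rates.sum()  # each salesman's share of the market
    # Independent Poisson counts conditioned on their sum are multinomial.
    return np.vstack([rng.multinomial(int(n), probs) for n in market_target])

rng = np.random.default_rng(0)
rates = np.array([2.0, 0.5, 0.1, 0.05])  # made-up per-salesman rates
market_target = np.array([5, 3, 0, 7])   # made-up estimated market volume
imputed = allocate_counts(rates, market_target, rng)

# The imputed counts add up to the market target on every day.
assert (imputed.sum(axis=1) == market_target).all()
```

This draws one random allocation rather than the single most probable one, and it assumes a constant rate per salesman over the gap (no individual seasonality). Repeating the draw would give a distribution of plausible imputations.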
I've been thinking and reading about hidden Markov models, Kalman filters, ARIMA models, LSTMs and Poisson regression, but I'm not an expert: I don't really know how many methods I need to use and, above all, how to combine them.
How would you proceed?
Thanks, Mathieu