The goal of my analysis is to build an alert system that detects when the current month's sales are deviating significantly from the monthly forecast, and to anticipate whether the month will close short or long. My idea is to obtain prediction intervals for the "shape" of the sales curve over the month and then compare the current forecast execution rate against those intervals. I've implemented the idea in R with dummy data:
set.seed(1) # for reproducibility
p <- 30     # days in the month
n <- 12     # 12 months of history
data <- matrix(rnorm(p * n, mean = 100, sd = 75), p, n) # dummy daily sales
data <- apply(data, 2, cumsum)                          # cumulative sales per month
data <- data %*% diag(1 / data[p, ])                    # cumulative share of the monthly total ("execution rate")
mu <- apply(data, 1, mean) # day-wise mean execution rate (renamed to avoid masking base mean()/var())
s2 <- apply(data, 1, var)  # day-wise variance
sup <- mu + qt(0.975, n - 1) * sqrt(s2 * (1 + 1/n)) # upper Student-t PI bound
inf <- mu - qt(0.975, n - 1) * sqrt(s2 * (1 + 1/n)) # lower Student-t PI bound
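# i.e. for each day d the band is  mean_d ± t(0.975, n-1) * sd_d * sqrt(1 + 1/n),
# the usual Student-t prediction interval for a single new observation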
matplot(data, type = "l", col = "gray", lty = 1, main="Prediction intervals for monthly sales pattern",
xlab="day of month", ylab="execution rate")
lines(mu, col = "red")
lines(sup, col = "blue")
lines(inf, col = "blue")
abline(0, 1/p) # linear reference: uniform daily sales (1/31 was inconsistent with p = 30)
legend(1, 1, legend=c("mean", "95% PI", "data", "linear"),
col=c("red", "blue", "gray", "black"), lty=1, cex=0.8)
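For the actual alert, this is roughly how I intend to use the bands. It's only a sketch: d, current_sales and monthly_forecast are made-up placeholders for the month being monitored, and it reuses mu, inf and sup from above:

d <- 18                                        # day of month we are at
current_sales <- rnorm(d, mean = 100, sd = 75) # dummy daily sales observed so far
monthly_forecast <- 3000                       # forecast for the full month
exec_rate <- cumsum(current_sales) / monthly_forecast # observed execution rate vs. forecast
alert <- exec_rate < inf[1:d] | exec_rate > sup[1:d]  # TRUE where the rate leaves the 95% band
which(alert)                                          # days that would trigger an alert
projected_total <- cumsum(current_sales)[d] / mu[d]   # naive end-of-month projection from the mean shape
projected_total > monthly_forecast                    # TRUE -> month looks like closing long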
Is this approach correct from a statistical point of view?