I have a data frame "customers" built from customer id, month, and total purchases in that month. I have calculated a running slope of total purchases (12-month window) for each customer. The problem is that for some customers the trend of the slope contradicts business logic. Consider the following vector of purchases: a customer buys nothing for the first 11 months and then buys 100, 50, and finally 4.

c(0,0,0,0,0,0,0,0,0,0,0,100,50,4)

This means that if t is the present month, these are the 12-month windows passed to rlm and their respective slopes:

t = c(0,0,0,0,0,0,0,0,0,100,50,4)

rlm_slope = 0.4541076

t_minus1 = c(0,0,0,0,0,0,0,0,0,0,100,50)

rlm_slope = 0.4478227

t_minus2 = c(0,0,0,0,0,0,0,0,0,0,0,100)

rlm_slope = 0.0003052124

So the slope increases even though, from a business point of view, the customer's performance is deteriorating. How can I get around this? Is there a general solution that would also work for other customers?
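The windowing described above can be sketched in base R. This is only an illustration: it uses ordinary least squares via `lm` as a stand-in for `rlm`, so the slopes it prints will not match the robust slopes quoted above. The helper name `ols_slope` and the variables `window_size`, `ends`, and `windows` are made up for this sketch.

```r
# The full 14-month purchase history from the question.
purchases <- c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 100, 50, 4)
window_size <- 12

# Slope of an ordinary least-squares line through one window
# (assumption: OLS is used here instead of the question's rlm).
ols_slope <- function(y) {
  x <- seq_along(y)
  unname(coef(lm(y ~ x))[2])
}

# Windows ending at months 12, 13 and 14, i.e. t_minus2, t_minus1 and t.
ends <- 12:14
windows <- lapply(ends, function(e) purchases[(e - window_size + 1):e])

slopes <- sapply(windows, ols_slope)
names(slopes) <- c("t_minus2", "t_minus1", "t")
print(slopes)
```

Even with plain least squares, all three slopes come out positive, because each window is dominated by the leading zeros followed by a late jump.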

Alex
  • You should use a kind of hurdle model. You need to model whether a customer purchases anything at all, and only if they do can you model how much they purchased. – Roland Aug 11 '20 at 10:23

1 Answer

This happens because the regression line has to fit all the initial 0 values as well. If you want to forecast short-term trends (that is, respond to how the values changed over the last few months), you may be better off using something like Holt’s method.
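As a rough sketch of what Holt's method does with this customer, the recursion can be written out by hand in base R. The function name `holt_trend` and the smoothing constants `alpha = 0.5` and `beta = 0.5` are illustrative assumptions, not tuned or recommended values; the initial level and trend use one common convention (first value, and first difference).

```r
# Holt's linear (double exponential) smoothing, written out by hand.
# Returns the smoothed trend component for every month.
# alpha and beta are illustrative smoothing constants (assumptions).
holt_trend <- function(y, alpha = 0.5, beta = 0.5) {
  n <- length(y)
  level <- y[1]            # initial level: first observation
  trend <- numeric(n)
  trend[1] <- y[2] - y[1]  # initial trend: first difference
  for (i in 2:n) {
    prev_level <- level
    # Update the level from the new observation and the previous forecast.
    level <- alpha * y[i] + (1 - alpha) * (prev_level + trend[i - 1])
    # Update the trend from the change in level.
    trend[i] <- beta * (level - prev_level) + (1 - beta) * trend[i - 1]
  }
  trend
}

purchases <- c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 100, 50, 4)
trend <- holt_trend(purchases)
print(tail(trend, 3))  # trend estimates for the last three months
```

With these particular constants the trend estimate shrinks over the last three months and ends slightly below zero, so it moves in the same direction as the purchases. The trend component can be read on its own as a direction indicator, without producing a forecast.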

  • The thing is, I don't want to forecast; I just want to say something about the customer's trend: whether he's deteriorating, improving, or stable. What if I used weights for the residuals? Would that help? Is there a rule for what those weights should be? – Alex Aug 11 '20 at 08:44
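One possible reading of the weighting idea in this comment, offered only as a sketch: fit the line with weights that decay geometrically with the age of the observation, so that recent months dominate the slope. The helper `weighted_slope` and the `decay = 0.5` constant are assumptions for illustration; the thread does not give a rule for choosing the weights.

```r
# Least-squares slope with geometrically decaying recency weights.
# decay = 0.5 is an arbitrary illustrative choice (assumption).
weighted_slope <- function(y, decay = 0.5) {
  n <- length(y)
  x <- seq_len(n)
  w <- decay^(n - x)  # newest month gets weight 1, older months less
  unname(coef(lm(y ~ x, weights = w))[2])
}

# The three 12-month windows from the question.
t_now    <- c(0, 0, 0, 0, 0, 0, 0, 0, 0, 100, 50, 4)
t_minus1 <- c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 100, 50)
t_minus2 <- c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 100)

slopes <- c(
  t_minus2 = weighted_slope(t_minus2),
  t_minus1 = weighted_slope(t_minus1),
  t        = weighted_slope(t_now)
)
print(slopes)
```

With this decay, the weighted slope falls from one window to the next and is negative for the most recent one. A slower decay (a value closer to 1) brings the result back toward the unweighted fit, so the outcome depends on that choice.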