You could treat this as a recurrent-event situation, in which each individual "consumption" represents a separate event.* Or you could treat this as a count-based problem, modeling the number of events in each observation time period. The best choice depends on the nature of your data.
First, make sure that your data adequately represent the information you have. For example, I can't tell whether Customer 2 was last seen at time period 10, or whether a longer period of time has elapsed without any consumption since time period 10. Your data set needs to keep track of the total elapsed time for each individual, even times without consumption.
Then look hard at the patterns of cumulative consumption over time, which will tend to smooth out the variability among time periods. Do that for a large number of individual customers. What types of patterns do you see? Does cumulative consumption tend to plateau at long times? If so, then it might make sense to think about a finite "Customer Lifetime." Alternatively, does cumulative consumption tend to keep rising over time? In that case there might not be a well-defined "Customer Lifetime" from your perspective; it might make sense just to estimate a mean rate of activity instead.
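For instance, here is a minimal R sketch of that exploratory plot. The data frame `orders` and its columns `customer`, `period`, and `count` are hypothetical names, assumed to hold one row per customer per time period:

```r
## Cumulative consumption per customer over time, from the hypothetical
## `orders` data frame with per-period event counts in `count`.
library(dplyr)
library(ggplot2)

orders %>%
  group_by(customer) %>%
  arrange(period, .by_group = TRUE) %>%
  mutate(cum_consumption = cumsum(count)) %>%
  ggplot(aes(x = period, y = cum_consumption, group = customer)) +
  geom_step(alpha = 0.3) +
  labs(x = "Time period", y = "Cumulative consumption")
```

Plateaus show up as flat right-hand tails; staircases that keep climbing suggest modeling a rate instead.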
The way to proceed thereafter depends on the activity patterns that you see. For a continually, albeit randomly, rising cumulative consumption, a Poisson or negative binomial model might work for estimating rates, with customers treated as random effects. Each customer would then have a characteristic underlying activity rate, with a distribution of rates among customers. That's a pretty standard type of generalized linear model. You would model the counts per time period, potentially using time period itself as a predictor to see if rates are systematically changing over time.
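As a concrete illustration, here is a minimal sketch with lme4, again assuming the hypothetical `orders` data frame of per-period counts:

```r
## Counts per period with a random intercept per customer; `period` as a
## fixed effect probes for systematic changes in rate over time.
library(lme4)

fit_pois <- glmer(count ~ period + (1 | customer),
                  data = orders, family = poisson)

## Negative binomial analogue if the counts are overdispersed:
fit_nb <- glmer.nb(count ~ period + (1 | customer), data = orders)

summary(fit_pois)
```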
If such a model fits your data adequately, then for a new customer you could try to estimate the rate from that customer's initial behavior.
If cumulative consumption does tend to plateau in time, you could use a recurrent-events survival model that takes the multiple customers into account. This type of model would also have to incorporate the censoring in time of your observations. For example, if it's been only 10 time periods since Customer 2 entered your data set, that customer doesn't provide any information on behavior beyond 10 time periods. You don't know what her future consumption might be, so you can't just assume there would be no further activity. Survival models take that into account.
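One common recurrent-events formulation is the Andersen-Gill model in R's survival package. The sketch below assumes the data have been reshaped to counting-process form, with one row per at-risk interval per customer; the data frame `events` and its columns `tstart`, `tstop`, `event`, and the covariate `segment` are hypothetical names. The `cluster` term gives robust standard errors that respect the repeated events within customers:

```r
## Andersen-Gill recurrent-events Cox model; censoring is handled by the
## (tstart, tstop] at-risk intervals, so Customer 2's record simply stops
## contributing after her last observed period.
library(survival)

fit_ag <- coxph(Surv(tstart, tstop, event) ~ segment + cluster(customer),
                data = events)
summary(fit_ag)
```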
If you have information about the customers besides their order histories, your models could include that information as covariates to potentially improve predictions for individuals.
In response to comments:
> this (also) implies every customer in the training dataset has to be observed for the same n periods, right?
No. Individuals can be observed over different periods of time. You set a separate time = 0 reference for each individual, typically the time that the individual first entered your data set. Then, for initial assessment of your data, you plot cumulative event numbers for each individual as a function of time relative to that individual's starting time. Plots for some individuals will simply extend to longer times than others.
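The re-zeroing step is straightforward; a sketch, again assuming the hypothetical `orders` data frame holds calendar periods:

```r
## Each customer's first observed period becomes time 0:
library(dplyr)
orders <- orders %>%
  group_by(customer) %>%
  mutate(rel_period = period - min(period)) %>%
  ungroup()
```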
Whether you model this from a consumption-rate perspective or a "Customer Lifetime Value" perspective, you can use whatever information you have. For example, if you are estimating rates per time period in a mixed model, you use the information on the customers that you have for each time period. If you are modeling total counts, you can take the total observation time for an individual into account with an offset term in a regression model, as sketched below. A survival-analysis recurrent-events approach naturally takes the "censoring" at the last observation time for an individual into account.
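A sketch of the offset idea, assuming a hypothetical one-row-per-customer data frame `customers` with a total count `total`, the number of observed periods `obs_periods`, and a covariate `segment`:

```r
## Modeling totals while adjusting for unequal observation times:
## the log-offset makes the model one for counts per period.
library(MASS)
fit_tot <- glm.nb(total ~ segment + offset(log(obs_periods)),
                  data = customers)
```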
> in these recurrent-events survival models, why does cumulative consumption have to plateau with time?
It doesn't. You need to find out whether that's the case before you can decide how to model. If cumulative consumption keeps going up indefinitely for individuals, then there is no sign of a finite "customer lifetime" and you need to focus instead on the consumption rate and whether the rate has any patterns as a function of time. If there is a plateau in cumulative consumption, then there might be a finite "customer lifetime" that could be modeled, in line with your wish to estimate a "Customer Lifetime Value."
> I was having trouble understanding, from the literature I saw, how to incorporate this intensity dimension
The way you presented your situation, it seems that the "consumption" can be modeled as counts of events. For example, that could be modeling clicks on ads on a web site, with each click representing a unit of "consumption." Each "consumption" event is essentially the same, but an individual can have multiple such events within a time period.**
For a point process, the instantaneous underlying event rate is actually called the "intensity." From that perspective, count-based models inherently model intensity. How best to do that depends on the nature of your data: whether you should model different customers as having different but individually constant baseline intensities, or whether you need to model intensities as a function of time (including time as a predictor in your model, in a form suggested by your knowledge of the subject matter, or in a flexible form like a spline).
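A flexible-intensity version of the earlier count model, as a sketch: `ns()` from the splines package lets the rate vary smoothly with time relative to each customer's entry (using the hypothetical `rel_period` from above):

```r
## Intensity varying smoothly over time via a natural cubic spline,
## still with a customer-level random intercept for baseline differences.
library(lme4)
library(splines)

fit_spline <- glmer(count ~ ns(rel_period, df = 3) + (1 | customer),
                    data = orders, family = poisson)
```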
> where you suggest the Poisson or negative binomial model - could you maybe cite a reference here where this is discussed in a (somewhat?) similar context?
Once you know the terms to search for, finding references becomes a bit easier. That can help whether you are analyzing this from a survival/recurrent events perspective or from a point-process/count-based perspective.
For identifying references that might help, you can think about each of your repeated "consumption" events as being analogous to repeated asthma attacks, repeated hospital admissions, etc., in the medical literature. Or you can think about count data over time that are less event-based, like counts of a type of cell in the blood of patients at successive clinical visits, or counts of RNA molecules of different types within individuals over time. The choice depends again on the nature of your data.
As noted in a revised part of the answer above, if you are modeling counts per time period you could have a fairly standard generalized linear mixed model based on an underlying Poisson or negative binomial process. The standard lme4 package in R provides tools for both. There is a wealth of information readily available about how to use those tools.
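As a quick way to choose between the two within that framework, here is a rough overdispersion check, continuing from the glmer sketch above:

```r
## Pearson chi-square divided by residual d.f.; values well above 1
## suggest overdispersion and favor the negative binomial (glmer.nb).
overdisp <- sum(residuals(fit_pois, type = "pearson")^2) /
  df.residual(fit_pois)
overdisp
```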
A DuckDuckGo search on "recurrent event" recently turned up many freely available reviews. Yadav et al. provide an "Overview"; Thomsen et al. illustrate approaches on a particular data set; Reliawiki has nice illustrations of cumulative event plots; Amorim and Cai provide a tutorial emphasizing epidemiology; Rogers has a nice overview in a slide deck.
A search on "negative binomial point process mixed model" covers many aspects of your situation from the point-process/count perspective. The "mixed model" term allows for taking differences among individuals into account efficiently. The "negative binomial" term allows the variance of the counts to differ from the mean, rather than forcing them to be equal as a Poisson model does, something that often is needed in practice. That search turned up a paper on modeling CD4 cell counts over time in patients, one on modeling tree regrowth following fires, and one on RNA-seq counts over time in individuals.
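To make that variance point concrete: a Poisson model forces $\text{Var}(Y) = \mu$, while in the common NB2 parameterization the negative binomial allows $\text{Var}(Y) = \mu + \mu^2/\theta$, with the extra dispersion parameter $\theta$ absorbing the overdispersion.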
*For a recurrent-event survival-type approach, it might be simpler to use the times of individual events rather than grouping events into time-period bins as displayed here.
**If the events can differ in kind, then you have a multi-state recurrent-events scenario. If individual events have different magnitudes of "consumption," then I think things get more complicated if you can't easily fit them into a multi-state model (e.g., into "small," "medium," and "large" events). There is an R package PtProcess that's used for seismology, a field in which events differ continuously in magnitude, and it might be useful (although I have no experience with it).