
Can anybody tell me how to do recency, frequency & monetary value (RFM) modeling & customer value modeling in R?

Also, can somebody refer me to some literature on it?

kjetil b halvorsen
Beta
  • You can also look at the [BTYD](http://cran.r-project.org/web/packages/BTYD/index.html) ("Buy 'Til You Die") package in R. I think Bruce Hardie is one of the authors, but I'm not sure. –  Nov 09 '12 at 10:08

3 Answers


As for references, Data Mining Using RFM Analysis should help as far as terminology and further references go.

One of the simplest (and most popular) ways to model the probability of customer response is to use logistic regression with RFM as explanatory variables (among other available variables).
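The RFM explanatory variables themselves are usually derived from a raw transaction log. A minimal sketch of that step follows, written in plain Python to stay dependency-free (in R the same aggregation is a few lines with `aggregate()`). The `transactions` data, the `rfm_table` helper, and the choice of total spend as the monetary value are assumptions for illustration only.

```python
from datetime import date

# Hypothetical transaction log: (customer_id, purchase_date, amount).
transactions = [
    ("a", date(2012, 1, 5), 20.0),
    ("a", date(2012, 3, 1), 35.0),
    ("b", date(2011, 11, 20), 100.0),
    ("c", date(2012, 2, 14), 15.0),
    ("c", date(2012, 2, 28), 25.0),
    ("c", date(2012, 3, 10), 10.0),
]


def rfm_table(transactions, as_of):
    """Return {customer_id: (recency_days, frequency, monetary)}.

    recency   = days between the customer's last purchase and `as_of`
    frequency = number of purchases up to `as_of`
    monetary  = total spend up to `as_of` (mean spend is another common choice)
    """
    last, count, total = {}, {}, {}
    for cust, day, amount in transactions:
        if day > as_of:
            continue  # purchases after the cutoff must not leak into the features
        last[cust] = max(last.get(cust, day), day)
        count[cust] = count.get(cust, 0) + 1
        total[cust] = total.get(cust, 0.0) + amount
    return {c: ((as_of - last[c]).days, count[c], total[c]) for c in last}


rfm = rfm_table(transactions, as_of=date(2012, 3, 31))
for customer, (recency, frequency, monetary) in sorted(rfm.items()):
    print(customer, recency, frequency, monetary)
```

The three columns then serve as predictors, with the response (bought or not) taken from the period after the cutoff date.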

For modeling monetary value, one could simply regress revenue on RFM directly (with a simple linear model for starters), which usually does surprisingly well. In my experience, more advanced non-linear models (such as Random Forest or Gradient Boosting Machine) do better than linear models.

Another popular approach is to build a slightly more complex model for predicting monetary value based on two sub-models: one for probability of response (e.g. using logistic regression as a function of RFM), and the other for revenue conditional on response (again, it could be as simple as a linear model of RFM). Expected monetary value is the product of the two predictions.
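The two-part approach can be sketched roughly as follows. This is an illustration under stated assumptions, not a reference implementation: the data are made up, the helper names (`standardize`, `fit_glm`, `predict`) are hypothetical, and both sub-models are fitted by plain gradient ascent just to avoid dependencies (in R one would more likely use `glm(..., family = binomial)` and `lm()`).

```python
import math

# Toy, made-up data: (recency_days, frequency, past_spend, responded, revenue).
# The RFM columns come from past data; responded/revenue from the later period.
data = [
    (10, 8, 400, 1, 60.0), (15, 6, 300, 1, 45.0), (30, 5, 250, 1, 40.0),
    (45, 4, 150, 0, 0.0), (60, 3, 200, 1, 30.0), (90, 2, 80, 0, 0.0),
    (120, 2, 60, 0, 0.0), (20, 7, 350, 0, 0.0), (200, 1, 30, 0, 0.0),
    (150, 1, 40, 0, 0.0), (25, 5, 220, 1, 50.0), (75, 3, 120, 1, 25.0),
]


def standardize(rows):
    """Scale each column to mean 0 and standard deviation 1."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    sds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
           for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(r, means, sds)] for r in rows]


def sigmoid(z):
    # Numerically stable logistic function.
    return 1.0 / (1.0 + math.exp(-z)) if z >= 0 else math.exp(z) / (1.0 + math.exp(z))


def identity(z):
    return z


def predict(w, x, mean_fn):
    return mean_fn(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))


def fit_glm(X, y, mean_fn, lr=0.05, steps=5000):
    """Approximate fit by gradient ascent; returns [intercept, w1, w2, ...].

    With mean_fn=sigmoid this is logistic regression, with mean_fn=identity
    it is ordinary least squares (both share the gradient X'(y - mean)).
    """
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(steps):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = yi - predict(w, xi, mean_fn)
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj + lr * g / len(X) for wj, g in zip(w, grad)]
    return w


X = standardize([row[:3] for row in data])
responded = [row[3] for row in data]
revenue = [row[4] for row in data]

# Sub-model 1: probability of response, fitted on all customers.
w_response = fit_glm(X, responded, sigmoid)

# Sub-model 2: revenue conditional on response, fitted on responders only.
buyers = [i for i, r in enumerate(responded) if r == 1]
w_revenue = fit_glm([X[i] for i in buyers], [revenue[i] for i in buyers], identity)

# Expected monetary value = P(response) * E[revenue | response].
p_response = [predict(w_response, x, sigmoid) for x in X]
rev_if_response = [predict(w_revenue, x, identity) for x in X]
expected = [p * r for p, r in zip(p_response, rev_if_response)]
```

A data set this small is far too thin for real use; the point is only the structure: the second model never sees non-responders, and the two predictions are multiplied at scoring time.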

If randomized test/control data are available then uplift/netlift based techniques are quite popular for modeling the incremental benefit of a treatment.
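As a very rough illustration of the idea behind uplift (not a full uplift model), the incremental effect can be estimated per segment as the response rate among treated customers minus the response rate among controls. The `records` data, the segment labels, and the `uplift_by_segment` helper below are made up for the example.

```python
from collections import defaultdict

# Hypothetical randomized test: (segment, treated, responded).
records = [
    ("high_rfm", True, 1), ("high_rfm", True, 1),
    ("high_rfm", True, 0), ("high_rfm", True, 1),
    ("high_rfm", False, 1), ("high_rfm", False, 1),
    ("high_rfm", False, 0), ("high_rfm", False, 1),
    ("low_rfm", True, 1), ("low_rfm", True, 1),
    ("low_rfm", True, 0), ("low_rfm", True, 0),
    ("low_rfm", False, 0), ("low_rfm", False, 0),
    ("low_rfm", False, 1), ("low_rfm", False, 0),
]


def uplift_by_segment(records):
    """Treated response rate minus control response rate, per segment.

    Assumes every segment has at least one treated and one control customer.
    """
    n = defaultdict(lambda: {True: 0, False: 0})     # customers per arm
    hits = defaultdict(lambda: {True: 0, False: 0})  # responders per arm
    for segment, treated, responded in records:
        n[segment][treated] += 1
        hits[segment][treated] += responded
    return {
        s: hits[s][True] / n[s][True] - hits[s][False] / n[s][False]
        for s in n
    }


uplift = uplift_by_segment(records)
print(uplift)
```

In this toy data the high-RFM segment responds well with or without treatment, so its estimated uplift is zero, while the low-RFM segment shows a positive incremental effect. That gap between response and incremental response is what uplift techniques target.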

As for customer life cycle value, see Modeling Customer Lifetime Value for a review and further references.

With regard to modeling in R, I am not aware of any "off-the-shelf" packages for that type of modeling. R does provide all the necessary building blocks, though (unless you have an enormous amount of data, in which case you may have to rely on more scalable tools).

Yevgeny
  • Very nice answer, but I think the first link might be broken. – dimitriy Nov 09 '12 at 17:16
  • @Yevgeny, I have two questions regarding your suggestions. First, when modelling monetary value, is it OK to regress revenue with Monetary among the predictor variables? I'm afraid they will be much the same variable. Second, do you have any online resources that could help me understand how to carry out linear regression conditional on the response (the second approach you described)? Thank you very much! – nhern121 Jun 26 '13 at 20:20
  • 1) It is okay as long as you are not confusing the explanatory/input variables (from past data) and the target variable (from "future" data) 2) Just choose the subset of data where customers bought something and regress the revenue on the explanatory variables – Yevgeny Jul 02 '13 at 19:48

Not sure if you are still working on the RFM modeling. Here (pdf) is an article (the vignette for the BTYD package in R) that might be helpful to you. The whole article is based on R and covers 3 different models. Section 2.1, Data Preparation, on page 1 gives the context about RFM.

gung - Reinstate Monica
sharp

There is a new R package that implements some of the latest modeling techniques for CLV in non-contractual settings (e.g. retailing): https://cran.r-project.org/package=CLVTools

Here is a step-by-step walk-through based on data from an apparel retailer: https://www.clvtools.com/articles/CLVTools.html

majom