I have longitudinal transaction data from a retail store, where each row is a transaction made by an individual. I would like to perform a survival analysis to estimate how long a customer keeps transacting before churning. The CoxPH model requires a tenure variable and a churn variable, among other variables, and I am not sure what the correct approach is to prepare them. How can I label whether a person has churned, other than by applying a threshold such as 45 days or 2 months? Also, what is the correct way to represent tenure?

Below is a sample of the data:

```
Id   Visit_date   Amount   Tenure   Churn   Age   Income
1    04/03/2020   500      ?        ?       40    56K
1    05/03/2020   300      ?        ?       32    60K
1    05/23/2020   800      ?        ?       28    90K
1    07/04/2020   700      ?        ?       40    56K
2    02/03/2020   500      ?        ?       43    50K
2    05/12/2020   300      ?        ?       60    90K
3    03/23/2020   800      ?        ?       18    80K
4    07/04/2020   700      ?        ?       20    40K
```

krishna koti

1 Answer

How you define churn is your decision. It is something to discuss with the business people who are going to use the results, and the definition will vary on a case-by-case basis. What is usually used is a threshold that is meaningful from a business perspective.
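
As a rough illustration of the threshold approach, here is a minimal pandas sketch that collapses the transaction rows from the question into one row per customer. It rests on several assumptions that are not in the original post: the observation window ends on a hypothetical snapshot date of 2020-07-31, a customer is treated as churned if the last visit is more than 45 days before that date, tenure for a churned customer runs from first to last visit, and tenure for a still-active (censored) customer runs from first visit to the snapshot date. Covariates such as Age and Income are left out.

```python
import pandas as pd

# Sample transactions from the question (dates are MM/DD/YYYY).
tx = pd.DataFrame({
    "Id": [1, 1, 1, 1, 2, 2, 3, 4],
    "Visit_date": ["04/03/2020", "05/03/2020", "05/23/2020", "07/04/2020",
                   "02/03/2020", "05/12/2020", "03/23/2020", "07/04/2020"],
    "Amount": [500, 300, 800, 700, 500, 300, 800, 700],
})
tx["Visit_date"] = pd.to_datetime(tx["Visit_date"], format="%m/%d/%Y")

# Assumptions for illustration only: end of the observation window and
# the inactivity threshold (45 days, one of the values in the question).
SNAPSHOT = pd.Timestamp("2020-07-31")
THRESHOLD = pd.Timedelta(days=45)

# One row per customer: first and last observed visit.
cust = tx.groupby("Id")["Visit_date"].agg(first_visit="min", last_visit="max")

# Churn = 1 if the customer has been inactive for longer than the threshold.
cust["Churn"] = ((SNAPSHOT - cust["last_visit"]) > THRESHOLD).astype(int)

# Tenure ends at the last visit for churned customers and at the snapshot
# date for customers who are still active (right-censored).
end = cust["last_visit"].where(cust["Churn"] == 1, SNAPSHOT)
cust["Tenure"] = (end - cust["first_visit"]).dt.days

print(cust[["Tenure", "Churn"]])
```

With these assumptions, customers 2 and 3 are flagged as churned and customers 1 and 4 are censored. The resulting `Tenure` and `Churn` columns are the duration and event columns a CoxPH fit expects. Other conventions are equally defensible, for example ending a churned customer's tenure at last visit plus the threshold, so the choice should follow the business definition.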

Tim
  • Thanks for the response. Also, suppose a threshold is used to define churn: how should I set up the **tenure** variable for this longitudinal data, given the sample above where user_id 1 has made four transactions? – krishna koti Jul 31 '20 at 21:39