I'd like to build a model to predict customer churn for a website.
Here is the information I have for each customer:
What they did:
- Visit the website
- Buy something
- Do not read the newsletter
- Read the newsletter
And when:
- The date of the action
For instance, for a customer X:
- 90 days ago: visit the website
- 50 days ago: do not read the newsletter
- 20 days ago: do not read the newsletter
The goal is to predict whether a customer is going to become inactive, based on this rule: if the customer did not visit the website, buy something, or read our newsletter email in the last 12 months, they are considered lost.
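To make the rule concrete, here is a minimal sketch of how the "lost" label could be computed for one customer. The function name `is_lost`, the action labels, the reference date, and the approximation of 12 months as 365 days are all assumptions made for illustration, not part of my actual data.

```python
from datetime import date, timedelta

# Actions that count as activity under the rule (assumed labels).
ACTIVE_ACTIONS = {"visit", "buy", "read"}


def is_lost(events, today, window_days=365):
    """Return True if no visit, purchase or newsletter read falls in the window.

    events: iterable of (action_date, action) tuples for one customer.
    window_days: 12 months approximated as 365 days (assumption).
    """
    cutoff = today - timedelta(days=window_days)
    return not any(
        action in ACTIVE_ACTIONS and action_date >= cutoff
        for action_date, action in events
    )


today = date(2024, 1, 1)  # arbitrary reference date for the example
customer_x = [
    (today - timedelta(days=90), "visit"),
    (today - timedelta(days=50), "do_not_read"),
    (today - timedelta(days=20), "do_not_read"),
]
print(is_lost(customer_x, today))  # False: the visit 90 days ago still counts
```

Note that "do not read" events never count as activity here, so a customer who only ignores newsletters for 12 months ends up lost.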
For now, I'm going to use this data structure (example with customer X):
User_id | Last_action_1 | Last_action_2 | Last_action_3
X       | Do not read   | Do not read   | Visit
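As a rough sketch, a row like this could be derived from the raw event log as follows. The helper `last_actions`, the concrete dates, and the second customer Y are hypothetical and only serve to show what the order-only encoding keeps and what it drops.

```python
from datetime import date


def last_actions(events, n=3):
    """Return the n most recent actions, most recent first (dates are dropped)."""
    ordered = sorted(events, key=lambda e: e[0], reverse=True)
    return [action for _, action in ordered[:n]]


customer_x = [
    (date(2023, 10, 3), "Visit"),
    (date(2023, 11, 12), "Do not read"),
    (date(2023, 12, 12), "Do not read"),
]
customer_y = [  # hypothetical: same actions, but much older dates
    (date(2023, 1, 5), "Visit"),
    (date(2023, 2, 1), "Do not read"),
    (date(2023, 2, 20), "Do not read"),
]

print(last_actions(customer_x))  # ['Do not read', 'Do not read', 'Visit']
print(last_actions(customer_x) == last_actions(customer_y))  # True
```

Two customers with very different timing end up with identical rows in this encoding.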
The problem is that I lose the time information and only retain the order of the actions.
What is the recommended way to keep the time information?
I have thought of the following idea, but I suspect there are better practices:
User_id | Last_action_1 | Last_action_1_date | Last_action_2 | Last_action_2_date | Last_action_3 | Last_action_3_date
X       | Do not read   | 20 days ago        | Do not read   | 50 days ago        | Visit         | 90 days ago
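For reference, here is a minimal sketch of how such a row could be built, storing each date as a number of days before a reference date. The function name `last_actions_with_age` and the reference date are assumptions for the example; the column names follow the table above.

```python
from datetime import date, timedelta


def last_actions_with_age(events, today, n=3):
    """Build one wide row: the n most recent actions plus their age in days."""
    row = {}
    ordered = sorted(events, key=lambda e: e[0], reverse=True)
    for i, (action_date, action) in enumerate(ordered[:n], start=1):
        row[f"Last_action_{i}"] = action
        # "N days ago" stored as an integer relative to the reference date.
        row[f"Last_action_{i}_date"] = (today - action_date).days
    return row


today = date(2024, 1, 1)  # arbitrary reference date for the example
customer_x = [
    (today - timedelta(days=90), "Visit"),
    (today - timedelta(days=50), "Do not read"),
    (today - timedelta(days=20), "Do not read"),
]
row = last_actions_with_age(customer_x, today)
print(row["Last_action_1"], row["Last_action_1_date"])  # Do not read 20
print(row["Last_action_3"], row["Last_action_3_date"])  # Visit 90
```

Customers with fewer than three recorded actions would simply have fewer columns filled in this sketch, so missing values would still need to be handled before modelling.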