Suppose you want to use the Cox proportional hazards (Cox PH) model to investigate how several independent variables affect the time to an event, in this case defaulting on a loan. Some of the independent variables (inflation and covid_lockdown) change only from one time step to the next and take the same value for every individual within a time step.
id  time  inflation  covid_lockdown  salary  debt  event
01  1     2%         no              30k     100k  0
02  1     2%         no              70k     50k   0
03  1     2%         no              2000k   0k    0
01  2     8%         yes             0k      110k  0
02  2     8%         yes             75k     45k   0
03  2     8%         yes             1500k   0k    0
01  3     6%         yes             40k     100k  1
02  3     6%         yes             80k     43k   0
03  3     6%         yes             1200k   100k  0
Since inflation and covid_lockdown have zero variance across individuals within each time step, their main effects cannot be estimated in the Cox PH model: a value shared by everyone in the risk set cancels out of the partial likelihood. However, we expect inflation to affect individuals differently. For example, an individual with a low salary is likely to suffer from high inflation, whereas someone with a high salary is unlikely to be affected. How can we include independent variables with zero within-time-step variance in the model?
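One way to see the problem is to write out a single term of the Cox partial likelihood. This is only a sketch under assumptions that are not in the data above: no tied event times, R(t) denotes the risk set at time t, x_j(t) the individual-level covariates (salary, debt), z(t) a covariate shared by everyone at time t (inflation or covid_lockdown), and g_j a hypothetical group indicator such as "low salary".

```latex
% Contribution of the event of individual i at time t_i.
% The shared term \gamma z(t_i) appears in the numerator and in every
% summand of the denominator, so it cancels and \gamma is not identifiable.
L_i(\beta, \gamma)
  = \frac{\exp\{\beta^\top x_i(t_i) + \gamma z(t_i)\}}
         {\sum_{j \in R(t_i)} \exp\{\beta^\top x_j(t_i) + \gamma z(t_i)\}}
  = \frac{\exp\{\beta^\top x_i(t_i)\}}
         {\sum_{j \in R(t_i)} \exp\{\beta^\top x_j(t_i)\}}

% With an interaction z(t) g_j the term differs between individuals
% in the risk set, so it no longer cancels and \delta can be estimated.
L_i(\beta, \delta)
  = \frac{\exp\{\beta^\top x_i(t_i) + \delta z(t_i) g_i\}}
         {\sum_{j \in R(t_i)} \exp\{\beta^\top x_j(t_i) + \delta z(t_i) g_j\}}
```

Under this reading, the coefficient of the interaction describes how the effect of the shared variable differs between groups, not its overall effect.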
Specifically, I want to investigate the effects of inflation and covid_lockdown on loan default for different groups (e.g., low salary vs. high salary).
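A possible direction, sketched here with the standard library only, is to interact the shared variable with a group indicator and check numerically whether the partial likelihood responds to the coefficient. Everything beyond the table above is an assumption made for illustration: the 50k cutoff for "low salary", the conversion of percentages and "k" amounts to plain numbers, and the helper names `low_salary`, `partial_log_lik`, `main_effect_ll`, and `interaction_ll`. The risk set at each time is taken to be every individual with a row at that time.

```python
import math

# Toy data from the question in long format, one row per individual per time step.
# Percentages and "k" amounts are written as plain numbers; yes/no becomes 1/0.
# Columns: id, time, inflation (%), covid_lockdown, salary (k), debt (k), event
ROWS = [
    ("01", 1, 2.0, 0, 30, 100, 0),
    ("02", 1, 2.0, 0, 70, 50, 0),
    ("03", 1, 2.0, 0, 2000, 0, 0),
    ("01", 2, 8.0, 1, 0, 110, 0),
    ("02", 2, 8.0, 1, 75, 45, 0),
    ("03", 2, 8.0, 1, 1500, 0, 0),
    ("01", 3, 6.0, 1, 40, 100, 1),
    ("02", 3, 6.0, 1, 80, 43, 0),
    ("03", 3, 6.0, 1, 1200, 100, 0),
]

LOW_SALARY_CUTOFF_K = 50  # hypothetical threshold, not taken from the question


def low_salary(row):
    """Group indicator: 1 if the salary at this time step is below the cutoff."""
    return 1 if row[4] < LOW_SALARY_CUTOFF_K else 0


def partial_log_lik(rows, linear_predictor):
    """Cox partial log-likelihood with time-varying covariates, assuming no ties."""
    total = 0.0
    for row in rows:
        if row[6] == 1:  # an event happened in this row
            # Risk set: everyone who has a row at the event time.
            risk_set = [r for r in rows if r[1] == row[1]]
            denom = sum(math.exp(linear_predictor(r)) for r in risk_set)
            total += linear_predictor(row) - math.log(denom)
    return total


def main_effect_ll(b):
    """Model with inflation as a main effect only."""
    return partial_log_lik(ROWS, lambda r: b * r[2])


def interaction_ll(b):
    """Model with the inflation x low-salary interaction only."""
    return partial_log_lik(ROWS, lambda r: b * r[2] * low_salary(r))


# The main-effect column stays constant across b; the interaction column does not.
for b in (0.0, 0.5, -2.0):
    print(b, round(main_effect_ll(b), 6), round(interaction_ll(b), 6))
```

With a single event in three individuals this says nothing about real effect sizes; it only illustrates that the interaction term carries information where the main effect does not. For an actual fit, software that accepts time-varying covariates in start/stop format would be needed, for example `CoxTimeVaryingFitter` in lifelines or `coxph` in the R survival package.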