I have a question about DDD (triple difference) with multiple time periods. There are several useful answers regarding DD with multiple time periods, and this question (3 related questions about DDD (TD, triple-diff) estimators) covers DDD with multiple time periods, but I am still confused about how to model this. Suppose I want to study the differential impact of a policy on old vs young people, where the policy is implemented at different times in different states and never implemented in some states. The three comparisons are:
treatment vs control
pre and post policy (with the policy being implemented at different times in different states)
old vs young
If the policy dates were the same, I could just run a normal DDD:
$y_{ist}=\beta_0+\beta_1 post_t + \beta_2 trt_s + \beta_3 old_i +\beta_4 post_t \times trt_s + \beta_5 post_t \times old_i +\beta_6 trt_s \times old_i + \beta_7 post_t \times trt_s \times old_i + \varepsilon_{ist}$
where $post$, $trt$, and $old$ are dummies for the post-policy period, treated states, and old individuals, respectively, and $\beta_7$ is my parameter of interest. However, if the policy is implemented at different times, the variable $post$ is not well defined: there is no single cutoff date, and never-treated states have no policy date at all. I tried running something like this:
$y_{ist}=\alpha_0+ \alpha_1 old_i +\alpha_2 policy_{st} + \alpha_3 policy_{st} \times old_i + \varepsilon_{ist}$
where $policy_{st}$ equals 1 if state $s$ is treated and period $t$ is after the implementation of the policy in state $s$, and 0 otherwise. However, this is not really satisfactory, because I am not controlling for potential level differences between treatment and control states. As I understand it, Wooldridge suggests the following: "In a DDD analysis, a full set of dummies is included for each of the two kinds of groups and all time periods, as well as all pairwise interactions. Then, a policy dummy (or sometimes a continuous policy variable) measures the effect of the policy." I find this very confusing. I take it to mean:
$y_{ist}=\gamma_0+ \sum_t \eta_t year_t + \gamma_2 old_i +\gamma_3 trt_s + \gamma_4 old_i \times trt_s +\sum_t \alpha_t (year_t \times old_i) +\sum_t \beta_t (year_t \times trt_s)+ \gamma_5 policy_{st} \times old_i + \varepsilon_{ist}$
The issue I have with this model is that, by interacting the treatment and year dummies, I am basically "killing off" most of the variation that I hope $\gamma_5$ will capture. I am also not sure I have understood the model correctly. How would you model a DDD in this context? (Apologies for the long question; I just wanted to be as clear as possible.)
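To make the $policy_{st}$ indicator from the second and third models concrete, here is a minimal sketch of how it could be built under staggered adoption. The state labels, adoption years, and helper names (`ADOPTION_YEAR`, `policy_dummy`, `regressor_row`) are all hypothetical, and I am assuming the adoption year itself counts as treated.

```python
# Hypothetical adoption years; None marks a never-treated state.
ADOPTION_YEAR = {"A": 2003, "B": 2005, "C": None}
YEARS = range(2002, 2007)


def policy_dummy(state, year, adoption=ADOPTION_YEAR):
    """Return 1 if `state` has adopted the policy by `year`, else 0.

    Assumption: the adoption year itself is counted as treated.
    """
    start = adoption[state]
    return int(start is not None and year >= start)


def regressor_row(state, year, old, adoption=ADOPTION_YEAR):
    """Build the treatment-related regressors for one (state, year, age group) cell."""
    trt = int(adoption[state] is not None)  # ever-treated indicator, trt_s
    policy = policy_dummy(state, year, adoption)  # policy_st
    return {
        "state": state,
        "year": year,
        "old": old,
        "trt": trt,
        "policy": policy,
        "policy_x_old": policy * old,  # regressor multiplying gamma_5
    }


# One row per state, year, and age group (old = 0 or 1).
rows = [
    regressor_row(s, t, old)
    for s in ADOPTION_YEAR
    for t in YEARS
    for old in (0, 1)
]

# Show the time path of policy_st for each state.
for s in ADOPTION_YEAR:
    print(s, [policy_dummy(s, t) for t in YEARS])
```

In this toy panel, states A and B both have $trt_s = 1$ but differ in $policy_{st}$ in 2003 and 2004, which is exactly why a single $post$ dummy cannot describe the staggered case.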