The difference-in-difference-in-differences (DDD) estimator you're referencing is typically used in settings where a law (i.e., treatment) is enacted in a staggered manner across jurisdictions (e.g., states). In practice, you may have a mixture of early- and late-adopter states, and another subset of states that never adopt the law during the observation window. States rarely impose new legislation at the same time, and evaluators typically wish to exploit this variation in treatment timing. We may also suspect the new legislation affects sub-groups differently within the treated states. This is yet another layer of variation we can exploit, assuming we actually observe individuals over time within each state.
Suppose that, while evaluating the effect of anti-corruption laws on wages in the United States, you acquire detailed employment records for individuals nested within states. Now suppose you suspect the law has a differential impact by age group or gender. Say, for example, you stratify workers by age within each state. To keep things simple, suppose theory suggests the law affects the earnings of younger employees differently than those of older employees. You dichotomize workers accordingly, with all employees under 35 years of age falling into the younger age category. Well, the new law now varies over three dimensions: age group $a$, state $s$, and year $t$. It's important to be aware that, because the laws are introduced at different times, a standard three-way interaction term built from a single common "post" indicator isn't going to work. We must define the law dummy to account for the staggered adoption periods.
The more general representation of the DDD equation is as follows:
$$
Y_{iast} = \gamma_{st} + \lambda_{at} + \eta_{as} + \delta L_{ast} + u_{iast},
$$
where $Y_{iast}$ denotes the earnings of individual $i$ in age group $a$, state $s$, and year $t$. It is regressed on a full set of state $\times$ year effects (i.e., $\gamma_{st}$), age $\times$ year effects (i.e., $\lambda_{at}$), age $\times$ state effects (i.e., $\eta_{as}$), and a law dummy (i.e., $L_{ast}$). It's also permissible to build a concatenated identifier (say, state-year) and let software 'dummy out' all the relevant state-by-year effects for you. Software will invariably drop some of the second-level terms to break the collinearity, but this shouldn't affect your estimate of $\delta$. I should stress that you must include all of the second-level interaction terms or your model may be misspecified.
Note, the principal variable of interest, $L_{ast}$, 'turns on' (i.e., switches from 0 to 1) in those $a$-$s$-$t$ combinations where the law is in place. It is your triple interaction term, just defined in a different way. According to your other post, it appears there is no well-defined period delineating pre-/post-treatment, so we must define the law indicator so that it captures the staggered onset of treatment across states and is restricted to the subset of individuals more sensitive to the anti-corruption legislation (i.e., younger employees). Put differently, imagine the law dummy is a column of $0$'s. As you work your way down the rows, assign the observation a value of $1$ if the individual is under 35 years of age, is nested within a treated state, and is observed in a year after the anti-corruption law went into effect.
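To make the row-by-row rule concrete, here is a minimal sketch in Python with pandas. The data, the column names (`worker`, `state`, `year`, `age`), and the adoption years are all made up for illustration. I'm also assuming the adoption year itself counts as treated; use a strict inequality if your law only bites from the following year.

```python
import pandas as pd

# Hypothetical worker-year records; all values are invented for illustration.
df = pd.DataFrame({
    "worker": [1, 1, 2, 2, 3, 3, 4, 4],
    "state":  ["A", "A", "A", "A", "B", "B", "C", "C"],
    "year":   [2004, 2006, 2004, 2006, 2004, 2006, 2004, 2006],
    "age":    [28, 30, 50, 52, 31, 33, 25, 27],
})

# Hypothetical adoption years. State C never adopts, so it has no entry.
adopt_year = {"A": 2005, "B": 2003}

# Younger group: under 35, as in the example above.
df["young"] = df["age"] < 35

# State-specific "post" indicator. Never-adopters map to NaN, and any
# comparison with NaN is False, so they stay at 0 in every year.
# Assumption: the adoption year itself counts as treated (>=).
df["law_in_effect"] = df["year"] >= df["state"].map(adopt_year)

# The law dummy: 1 only for young workers in a state-year with the law in place.
df["L"] = (df["young"] & df["law_in_effect"]).astype(int)

print(df[["worker", "state", "year", "age", "L"]])
```

The key point is that the "post" piece is looked up per state rather than being one common cutoff, which is exactly what the staggered rollout requires.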
In the paper you referenced, it appears the author clusters by region and estimates all second-order interaction terms. You could call these terms "fixed effects," but that label is a bit misleading, because it suggests you only need a fixed effect at the individual, unit, and/or time level. That is not the case here. You should also estimate all state-year, age-year, and age-state effects to preserve the hierarchy.
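If it helps, here is a rough simulated sketch of the full DDD regression, using only numpy and pandas. Everything in it is an assumption made for illustration: the six states, the adoption years, the true effect of 2.0, and the noise level. It uses plain least squares via `np.linalg.lstsq` rather than a dedicated regression package, and it does not compute clustered standard errors. The point is only to show that you can throw in all three sets of concatenated second-level dummies, let the solver cope with the collinearity, and still recover $\delta$.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Hypothetical design: 6 states, 10 years, 2 age groups, 20 workers per cell.
adopt_year = {0: 2003, 1: 2003, 2: 2006, 3: 2006}  # states 4 and 5 never adopt
true_delta = 2.0                                   # assumed effect of the law
rows = [(a, s, t)
        for a in ["young", "old"]
        for s in range(6)
        for t in range(2000, 2010)
        for _ in range(20)]
df = pd.DataFrame(rows, columns=["age", "state", "year"])

# Staggered law dummy: young workers in state-years at or after adoption.
df["L"] = ((df["age"] == "young")
           & (df["year"] >= df["state"].map(adopt_year))).astype(int)

# Concatenated identifiers for the three sets of second-level effects.
df["st"] = df["state"].astype(str) + "_" + df["year"].astype(str)
df["at"] = df["age"] + "_" + df["year"].astype(str)
df["as"] = df["age"] + "_" + df["state"].astype(str)

def random_effect(labels):
    """Draw one random value per distinct label and map it back to the rows."""
    levels = labels.unique()
    draws = pd.Series(rng.normal(size=len(levels)), index=levels)
    return labels.map(draws).to_numpy()

# Simulated earnings following the DDD equation, plus idiosyncratic noise.
df["y"] = (random_effect(df["st"]) + random_effect(df["at"])
           + random_effect(df["as"]) + true_delta * df["L"]
           + rng.normal(scale=0.2, size=len(df)))

# Design matrix: law dummy first, then every state-year, age-year,
# and age-state dummy. No dummies are dropped by hand.
X = pd.concat(
    [df[["L"]].astype(float)]
    + [pd.get_dummies(df[c], prefix=c, dtype=float) for c in ["st", "at", "as"]],
    axis=1,
)

# X is rank deficient, so lstsq returns the minimum-norm solution. The
# coefficient on L is still pinned down, because L is not a linear
# combination of the second-level dummies.
coef, *_ = np.linalg.lstsq(X.to_numpy(), df["y"].to_numpy(), rcond=None)
delta_hat = coef[0]
print(f"estimated delta: {delta_hat:.3f} (true value: {true_delta})")
```

In real work you would use a regression routine that absorbs the fixed effects and reports cluster-robust standard errors. The estimate of $\delta$ should be the same whichever redundant dummies your software chooses to drop.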