Since a two-way fixed effects model is equivalent to a DD estimator, I was wondering if running a three-way fixed effects model would be equivalent to the difference-in-difference-in-differences (DDD) estimator?
Technically, yes. But it isn't as simple as tossing in the fixed effects at each level and estimating a three-way interaction term. The second-order (two-way) interaction terms must remain in the model.
Say you have three dimensions over which a treatment may vary: state $s$, year $t$, and age group $a$. Let's look at the difference-in-difference-in-differences estimator when the timing of the intervention is well-defined. Suppose state $A$ is exposed to a policy and state $B$ is not. In state $A$, the policy is intended to target individuals under the age of 35. The equation is as follows:
$$
y_{iast} = \beta_0 + \beta_1Treat_s + \beta_2Age_a + \beta_3Post_t + \beta_4 (Treat_s \times Post_t) + \beta_5 (Age_a \times Post_t) + \beta_6 (Treat_s \times Age_a) + \beta_7 (Treat_s \times Age_a \times Post_t) + u_{iast}
$$
where $Treat_s$ is equal to 1 for state $A$, 0 otherwise. $Age_a$ is equal to 1 if a person is under the age of 35, 0 otherwise. $Post_t$ is a standardized time indicator equal to 1 in all years after the policy goes into effect, 0 otherwise.
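To see why $\beta_7$ is the triple difference, here is a minimal sketch on simulated data. Everything in it (the cell sizes, the coefficient values, the helper names `cell_mean`, `did`, `design`, and `solve`) is made up for illustration. Because the equation is saturated in the three dummies, OLS reproduces the eight cell means, so $\beta_7$ should match the difference between the two age-specific difference-in-differences computed by hand.

```python
import random
from statistics import mean

random.seed(0)

# Hypothetical simulated data: 2 states x 2 age groups x 2 periods,
# 50 individuals per cell. All coefficient values are assumptions.
TRUE_DDD = 2.0
rows = []  # each row is (treat, young, post, y)
for treat in (0, 1):
    for young in (0, 1):
        for post in (0, 1):
            for _ in range(50):
                y = (1.0 + 0.5 * treat + 0.3 * young + 0.7 * post
                     + 0.4 * treat * post + 0.2 * young * post
                     + 0.6 * treat * young
                     + TRUE_DDD * treat * young * post
                     + random.gauss(0, 1))
                rows.append((treat, young, post, y))


def cell_mean(treat, young, post):
    """Average outcome in one state-by-age-by-period cell."""
    return mean(y for s, a, t, y in rows if (s, a, t) == (treat, young, post))


def did(young):
    """Difference-in-differences across states within one age group."""
    return ((cell_mean(1, young, 1) - cell_mean(1, young, 0))
            - (cell_mean(0, young, 1) - cell_mean(0, young, 0)))


# Triple difference: DD for the under-35 group minus DD for the older group.
ddd_from_means = did(1) - did(0)


def design(s, a, t):
    """Regressors in the same order as beta_0 through beta_7."""
    return [1, s, a, t, s * t, a * t, s * a, s * a * t]


def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [mr - factor * mc for mr, mc in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


# OLS through the normal equations (X'X) beta = X'y.
X = [design(s, a, t) for s, a, t, _ in rows]
yv = [r[3] for r in rows]
k = len(X[0])
XtX = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
Xty = [sum(x[i] * yi for x, yi in zip(X, yv)) for i in range(k)]
beta = solve(XtX, Xty)
beta7 = beta[7]

print(beta7, ddd_from_means)
```

The two printed numbers should agree up to floating-point error. Dropping any of the two-way interactions from `design` breaks this equality, which is the sense in which the second-order terms must remain.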
You may have seen the more general representation of this equation:
$$
y_{iast} = \gamma_{st} + \lambda_{at} + \eta_{as} + \delta D_{ast} + X_{iast}'\beta + u_{iast},
$$
which includes state-year effects, age-year effects, and age-state effects. Depending upon how you define the relevant variables, your software may drop some of the second-order terms. You could also include state-year and/or age-year effects in the model as single concatenated group identifiers. This may result in collinearity, though it shouldn't affect your estimate of $\delta$.
In practice, I would simply regress $y_{iast}$ on state-year interactions, age-year interactions, age-state interactions, and your policy dummy. The separate state, age, and year fixed effects are absorbed by these interactions, so you get them for free. Note, $D_{ast}$ is your triple interaction term, just defined in a different way. Here, we instantiate the treatment dummy manually. Put differently, $D_{ast}$ is equal to 1 if an observation meets three conditions: (1) the state is in the treatment group, (2) the individual falls in the younger age category, and (3) the observation falls in a "post-treatment" time period. Manually coding this interaction is useful in settings where the "timing" of the policy isn't always well-defined.
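A rough sketch of that regression using NumPy on simulated data follows. The panel dimensions, the treated states, the policy year, and the helper `dummies` are all assumptions chosen for illustration, and `np.linalg.lstsq` stands in for whatever regression routine you normally use. It builds $D_{ast}$ by hand, stacks the three sets of two-way dummies, and checks that adding separate state, age, and year dummies changes neither the rank of the design nor the estimate of $\delta$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical panel: 4 states, 2 age groups, 6 years, 5 people per cell.
n_states, n_ages, n_years, n_per_cell = 4, 2, 6, 5
treated_states = [0, 1]   # assumed treated states
policy_year = 3           # assumed first post-treatment year
TRUE_DELTA = 1.5          # assumed treatment effect

s, a, t = np.meshgrid(np.arange(n_states), np.arange(n_ages),
                      np.arange(n_years), indexing="ij")
s, a, t = (np.repeat(v.ravel(), n_per_cell) for v in (s, a, t))

# Manually coded policy dummy: treated state, younger group, post period.
D = (np.isin(s, treated_states) & (a == 1) & (t >= policy_year)).astype(float)


def dummies(codes):
    """One indicator column per distinct integer group code."""
    return (codes[:, None] == np.unique(codes)[None, :]).astype(float)


state_year = dummies(s * n_years + t)    # gamma_{st}
age_year = dummies(a * n_years + t)      # lambda_{at}
age_state = dummies(a * n_states + s)    # eta_{as}

# Simulated outcome with arbitrary two-way effects plus noise.
y = (rng.normal(size=n_states * n_years)[s * n_years + t]
     + rng.normal(size=n_ages * n_years)[a * n_years + t]
     + rng.normal(size=n_ages * n_states)[a * n_states + s]
     + TRUE_DELTA * D
     + rng.normal(scale=0.5, size=D.size))

# The design is rank deficient (the dummy sets overlap), but delta is
# still identified, so the least-squares solution pins it down.
X = np.column_stack([D, state_year, age_year, age_state])
coef, _, rank, _ = np.linalg.lstsq(X, y, rcond=None)
delta_hat = coef[0]

# Adding separate state, age, and year dummies adds nothing new:
# they already lie in the span of the two-way interactions.
X_full = np.column_stack([X, dummies(s), dummies(a), dummies(t)])
coef_full, _, rank_full, _ = np.linalg.lstsq(X_full, y, rcond=None)
delta_hat_full = coef_full[0]

print(delta_hat, delta_hat_full, rank, X.shape[1])
```

The reported rank is smaller than the number of columns, which is the collinearity most packages resolve by dropping redundant dummies. The coefficient on `D` is unaffected either way.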
In my opinion, it's a bit misleading to say this estimator parallels the two-way fixed effects estimator. The generalized difference-in-differences estimator regresses some outcome on unit fixed effects, time fixed effects, and a treatment dummy. Note: two-way fixed effects implies separate unit and time effects, not a unit-time effect. In keeping with my previous example, a single state-year effect is not appropriate in a difference-in-differences setting, because it is collinear with any treatment that varies at the state-year level. In fact, in the absence of individual data within states, a state-year effect would chew up all your degrees of freedom. In short, it's insufficient to claim that a three-way fixed effects equation is a difference-in-difference-in-differences estimator. In settings where we triple difference, we must attempt to estimate all lower-order interaction terms.
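The degrees-of-freedom point is easy to check numerically. The sketch below assumes a made-up state-level panel with one observation per state-year, and the helper `dummies` is mine. It compares the rank of a design with separate state and year effects against one with a single state-year effect.

```python
import numpy as np

# Hypothetical state-level panel: one observation per state-year.
n_states, n_years = 5, 4
s, t = (v.ravel() for v in np.meshgrid(np.arange(n_states),
                                       np.arange(n_years), indexing="ij"))
n_obs = s.size

# Assumed policy: states 0 and 1 are treated from year 2 onward.
D = ((s < 2) & (t >= 2)).astype(float)


def dummies(codes):
    """One indicator column per distinct integer group code."""
    return (codes[:, None] == np.unique(codes)[None, :]).astype(float)


# Two-way fixed effects: separate state and year effects plus D.
twfe = np.column_stack([dummies(s), dummies(t), D])

# A single state-year effect, alone and together with D.
cell_only = dummies(s * n_years + t)
cell_plus_d = np.column_stack([cell_only, D])

resid_df_twfe = n_obs - np.linalg.matrix_rank(twfe)
resid_df_cell = n_obs - np.linalg.matrix_rank(cell_plus_d)

# D adds no rank once state-year dummies are in: it is perfectly collinear.
d_is_collinear = (np.linalg.matrix_rank(cell_only)
                  == np.linalg.matrix_rank(cell_plus_d))

print(resid_df_twfe, resid_df_cell, d_is_collinear)
```

With separate state and year effects there are residual degrees of freedom left to estimate the treatment effect. With a state-year effect there are none, and the treatment dummy is absorbed entirely.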
Also, is it an issue if my age group variable is a dummy?
No problem at all.
In the first equation, $Age_a$ is equal to 1 if the individual is below a certain age threshold, 0 otherwise.