
I have two endogenous variables $x_1$ and $x_2$ and am trying to estimate the following model:

$$y = \theta_0 + \theta_1 x_1 + \theta_2 x_2 + \theta_{12} x_{12}$$

where $x_{12} = x_1\times x_2$. I'm particularly interested in $\theta_{12}$, the coefficient on the interaction term. I also have two variables $z_1$ and $z_2$ that are valid instruments for $x_1$ and $x_2$, so I take $z_{12} = z_1\times z_2$ to be a valid instrument for $x_{12}$. I know that models with more than one endogenous variable are difficult to interpret.

If I were to use 2SLS, would I need to regress $x_{12}$ on the entire set of IVs $z = \left\{z_1, z_2, z_1\times z_2\right\}$? Also, what is the risk in regressing $y$ on $\hat{x}_1$, $\hat{x}_2$ and $\hat{x}_1\times \hat{x}_2$ (instead of $\hat{x}_{12}$)?
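To make the two procedures concrete, here is a minimal simulation sketch in Python. Everything about the data-generating process (the confounder `u`, the coefficient values, the sample size) is an assumption chosen for illustration and is not part of my actual data. The first estimator projects every regressor, including $x_{12}$, on the full instrument set, which is what textbook 2SLS does. The second plugs $\hat{x}_1\times\hat{x}_2$ into the second stage instead. The closed-form just-identified IV estimator is returned as a cross-check.

```python
import numpy as np

def ols(X, y):
    """Least-squares coefficients of y on the columns of X."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

def simulate(n=100_000, seed=0):
    """Hypothetical design: u confounds x1, x2 and y; z1, z2 are independent of u."""
    rng = np.random.default_rng(seed)
    z1, z2 = rng.normal(size=n), rng.normal(size=n)
    u = rng.normal(size=n)  # unobserved confounder (assumed)
    x1 = z1 + u + rng.normal(size=n)
    x2 = z2 + u + rng.normal(size=n)
    # Assumed true values: theta_0 = 1, theta_1 = 0.5, theta_2 = -0.3, theta_12 = 0.8
    y = 1.0 + 0.5 * x1 - 0.3 * x2 + 0.8 * x1 * x2 + 2.0 * u + rng.normal(size=n)
    return z1, z2, x1, x2, y

def compare(z1, z2, x1, x2, y):
    one = np.ones_like(y)
    Z = np.column_stack([one, z1, z2, z1 * z2])  # full instrument set
    X = np.column_stack([one, x1, x2, x1 * x2])  # regressors incl. interaction

    # 2SLS: project EVERY column of X (including x12) on the full set Z
    X_hat = Z @ ols(Z, X)
    theta_2sls = ols(X_hat, y)

    # Plug-in variant: replace x12_hat by the product of the fitted x1 and x2
    x1_hat, x2_hat = X_hat[:, 1], X_hat[:, 2]
    X_plug = np.column_stack([one, x1_hat, x2_hat, x1_hat * x2_hat])
    theta_plugin = ols(X_plug, y)

    # Closed-form IV estimator, valid here because the model is just-identified
    theta_iv = np.linalg.solve(Z.T @ X, Z.T @ y)
    return theta_2sls, theta_plugin, theta_iv

z1, z2, x1, x2, y = simulate()
theta_2sls, theta_plugin, theta_iv = compare(z1, z2, x1, x2, y)
print("2SLS   :", theta_2sls)
print("plug-in:", theta_plugin)
print("IV     :", theta_iv)
```

The sketch only shows how the two second stages differ mechanically. Whether the plug-in variant lands close to 2SLS depends on the first stage, so a single simulated design like this one should not be read as evidence about the general case.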

In the end, I would like to work with summary statistics only, for example by first regressing $x_1$, $x_2$, $x_{12}$ and $y$ on each instrument and then using inverse-variance weighting—like in the case of Mendelian randomization—to estimate $\theta_{12}$. Is the IVW estimate the same as the 2SLS estimate when an interaction term is present in the model?
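As a sketch of what I mean by working from summary statistics, the Python snippet below regresses each of $x_1$, $x_2$, $x_{12}$ and $y$ on each instrument separately, then runs a weighted, no-intercept regression of the instrument-outcome slopes on the instrument-exposure slopes, using weights $1/\mathrm{se}_y^2$. The simulated data and the helper names (`marginal_slope`, `ivw_from_summaries`, `tsls_slopes`) are assumptions for illustration only. With three instruments and three exposures, all computed in one sample, the weighted regression has as many rows as parameters, so the weights drop out and the result coincides with the 2SLS slopes. The sketch does not cover the over-identified or two-sample settings, where that argument no longer applies.

```python
import numpy as np

def marginal_slope(z, v):
    """Slope and standard error from regressing v on one instrument z (with intercept)."""
    zc, vc = z - z.mean(), v - v.mean()
    slope = (zc @ vc) / (zc @ zc)
    resid = vc - slope * zc
    se = np.sqrt((resid @ resid) / (len(z) - 2) / (zc @ zc))
    return slope, se

def ivw_from_summaries(instruments, exposures, y):
    """Multivariable IVW: weighted no-intercept regression of b_y on B_x."""
    # B_x[j, k] = marginal slope of exposure k on instrument j
    B_x = np.array([[marginal_slope(z, x)[0] for x in exposures] for z in instruments])
    b_y, se_y = np.array([marginal_slope(z, y) for z in instruments]).T
    w = 1.0 / se_y  # square root of the inverse-variance weights
    return np.linalg.lstsq(B_x * w[:, None], b_y * w, rcond=None)[0]

def tsls_slopes(instruments, exposures, y):
    """Just-identified 2SLS with an intercept; returns the slopes only."""
    n = len(y)
    Z = np.column_stack([np.ones(n), *instruments])
    X = np.column_stack([np.ones(n), *exposures])
    return np.linalg.solve(Z.T @ X, Z.T @ y)[1:]

# Hypothetical data: u is an unobserved confounder, coefficients are assumed values
rng = np.random.default_rng(1)
n = 50_000
z1, z2, u = rng.normal(size=n), rng.normal(size=n), rng.normal(size=n)
x1 = z1 + u + rng.normal(size=n)
x2 = z2 + u + rng.normal(size=n)
y = 1.0 + 0.5 * x1 - 0.3 * x2 + 0.8 * x1 * x2 + 2.0 * u + rng.normal(size=n)

instruments = [z1, z2, z1 * z2]
exposures = [x1, x2, x1 * x2]
theta_ivw = ivw_from_summaries(instruments, exposures, y)
theta_2sls = tsls_slopes(instruments, exposures, y)
print("IVW :", theta_ivw)
print("2SLS:", theta_2sls)
```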

Richard Hardy
Biblot
    The answers to your questions are likely to depend on the context. For example, in what way is $x_{12}$ endogenous conditional on $x_1$ and $x_2$? Even though $x_1$ and $x_2$ may be endogenous, their interaction may be exogenous when conditioning on them individually. If all three are endogenous, then your choice of instruments will depend on how they relate to each other. Maybe the figures in https://arxiv.org/ftp/arxiv/papers/1301/1301.0560.pdf are of use. – Elias Dec 19 '19 at 08:53

0 Answers