
I know that, in general, a structural causal model (SCM) can be written as a set of structural equations. In a more qualitative but still formal way, a structural model can also be represented as a DAG.

Now suppose we have a simple structural equation like this:

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_n x_n + e$$

where $e$ is completely exogenous. We don't know anything else about the model.

How can we write this model as a DAG?

EDIT: I also have some sub-questions:

1) The SCM above implies that $E[y|do(x_1,\dots,x_n)] = \beta_0 + \beta_1 x_1 + \dots + \beta_n x_n$. Is it true that $E[y|do(x_1,\dots,x_n)] = E[y|x_1,\dots,x_n]$ regardless of the causal nexus among the $x$s?

2) If we know only a subset of the causal variables, say $x_1,\dots,x_k$ with $k<n$, then we have a problem that looks like omitted variables. Is there a way to find the other variables ($x_{k+1},\dots,x_n$)?

2a) If there is, does the causal nexus among the $x$s become relevant?

2b) If there is not, is it still possible to identify the causal parameters $\beta_1,\dots,\beta_k$?

markowitz
  • "As become this model in DAG form ?" This isn't standard English, so I'm not sure what you're asking. DAGs are nonparametric, but your equation is a specific parameterization, so if you were to draw a DAG for it, there would be many parameterizations consistent with the DAG. Also, it's not clear what the dependence relations are among the Xs; do they share common causes? Do they have a causal ordering? – Noah Jul 10 '19 at 19:53
  • First, I'm sorry for my English. The equation that I want to translate into DAG form is exactly the one above. Moreover, in general, for any DAG structure there are many parametric equation forms, but for any set of parametric equations there is only one DAG structure. Is that correct? – markowitz Jul 11 '19 at 09:30
  • About the relations among the Xs, the problem is exactly that we don't know much. At most we can say that all the Xs are, at least potentially, causes of y, and that the Xs are, at least potentially, correlated with each other. The causal nexus among the Xs is entirely unknown. – markowitz Jul 11 '19 at 09:36
  • A simple DAG with arrows pointing from the $x$s to $y$ is all you could draw, but that wouldn't be any more descriptive than saying "all the $x$s cause $y$". A DAG provides less information than a parametric structural causal model. Furthermore, without any assumptions about the causal relations among the $x$s, this DAG couldn't be used for any causal analysis (i.e., to identify confounders, colliders, or instruments). – Noah Jul 12 '19 at 16:46
  • I feared something like that. However, it is a matter of fact that econometric models sometimes try to reach causal conclusions with a theory that, in causal terms, does not go beyond the above conditions. – markowitz Jul 16 '19 at 20:56
  • @markowitz I tried to improve the english of your question, see what you think. Also, see my answer below. In short, if your only assumption is the structural equation of $Y$ plus exogeneity, you specify a class of DAGs, not a single DAG. – Carlos Cinelli Aug 13 '19 at 02:37

1 Answer


Your model statement specifies a class of DAGs, not a single DAG. That is, all DAGs in which $x_1, \dots, x_n$ are direct causes of $y$ and $e$ is exogenous are compatible with your assumptions.
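To make "a class of DAGs" concrete, here is a minimal sketch that enumerates the members of that class for small $n$. It rests on a simplifying assumption that is not part of the answer: only directed edges among the observed $x$s are considered, so DAGs in which some $x$s share an unobserved common cause are left out. The function names `is_acyclic` and `compatible_dags` are hypothetical.

```python
from itertools import combinations, product


def is_acyclic(nodes, edges):
    """Kahn's algorithm: True if the directed graph has no cycle."""
    indegree = {v: 0 for v in nodes}
    for _, child in edges:
        indegree[child] += 1
    frontier = [v for v in nodes if indegree[v] == 0]
    visited = 0
    while frontier:
        v = frontier.pop()
        visited += 1
        for parent, child in edges:
            if parent == v:
                indegree[child] -= 1
                if indegree[child] == 0:
                    frontier.append(child)
    return visited == len(nodes)


def compatible_dags(n):
    """List edge sets in which every x_i -> y is present and the x's form a DAG.

    Assumption: latent common causes among the x's are ignored; the exogenous
    error e is an implicit parent of y only and is not drawn.
    """
    xs = [f"x{i}" for i in range(1, n + 1)]
    required = [(x, "y") for x in xs]          # fixed by the structural equation
    pairs = list(combinations(xs, 2))
    dags = []
    # Each pair of x's is unconnected, oriented forward, or oriented backward.
    for choice in product((None, "fwd", "bwd"), repeat=len(pairs)):
        among_xs = []
        for (a, b), c in zip(pairs, choice):
            if c == "fwd":
                among_xs.append((a, b))
            elif c == "bwd":
                among_xs.append((b, a))
        if is_acyclic(xs, among_xs):           # y is a sink, so it cannot close a cycle
            dags.append(among_xs + required)
    return dags


if __name__ == "__main__":
    for dag in compatible_dags(2):
        print(dag)
```

Under this restriction the structural equation alone leaves 3 candidate graphs for two regressors and 25 for three, so the equation plus exogeneity narrows the set of graphs but does not pick one.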

For instance, for simplicity, say we have only $x_1$ and $x_2$. Then, among several other alternatives, the following DAGs would be compatible with your model specification:

[Three DAG images not recovered]

But the following DAG would not be compatible, since the error term of $Y$ is correlated with the error term of $x_2$ (note, however, that in this DAG the causal effect of $x_1$ is still identified):

[DAG image not recovered]
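The distinction can be checked numerically. The sketch below is only an illustration under stated assumptions: the original figures are not available, so the four graphs simulated here (`x1_causes_x2`, `x2_causes_x1`, `common_cause`, `x2_error_correlated`) are hypothetical stand-ins, and the coefficients ($\beta_1 = 2$, $\beta_2 = -1$, edge weight 0.8) and standard normal noise are arbitrary choices. In the last graph $x_2$ is assumed to cause $x_1$, which is one structure in which the effect of $x_1$ stays identified after conditioning on $x_2$.

```python
import random

BETA1, BETA2 = 2.0, -1.0  # hypothetical structural coefficients of y


def simulate(dag, n=20_000, seed=0):
    """Draw (y, x1, x2) from one illustrative linear SCM with standard normal noise."""
    rng = random.Random(seed)
    ys, x1s, x2s = [], [], []
    for _ in range(n):
        e = rng.gauss(0, 1)                      # structural error of y
        if dag == "x1_causes_x2":
            x1 = rng.gauss(0, 1)
            x2 = 0.8 * x1 + rng.gauss(0, 1)
        elif dag == "x2_causes_x1":
            x2 = rng.gauss(0, 1)
            x1 = 0.8 * x2 + rng.gauss(0, 1)
        elif dag == "common_cause":
            z = rng.gauss(0, 1)                  # unobserved cause of the x's only
            x1 = z + rng.gauss(0, 1)
            x2 = z + rng.gauss(0, 1)
        elif dag == "x2_error_correlated":
            u = rng.gauss(0, 1)                  # unobserved, enters both x2 and y
            x2 = u + rng.gauss(0, 1)
            x1 = 0.8 * x2 + rng.gauss(0, 1)
            e += u                               # e is no longer exogenous
        else:
            raise ValueError(dag)
        ys.append(BETA1 * x1 + BETA2 * x2 + e)
        x1s.append(x1)
        x2s.append(x2)
    return ys, x1s, x2s


def ols_slopes(y, x1, x2):
    """OLS slopes of y on (x1, x2) with an intercept, via demeaned normal equations."""
    n = len(y)
    my, m1, m2 = sum(y) / n, sum(x1) / n, sum(x2) / n
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    det = s11 * s22 - s12 ** 2
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det


if __name__ == "__main__":
    for dag in ("x1_causes_x2", "x2_causes_x1", "common_cause", "x2_error_correlated"):
        b1, b2 = ols_slopes(*simulate(dag))
        print(f"{dag:22s} b1={b1:.3f} b2={b2:.3f}")
```

With these particular choices, the first three graphs should give estimates close to $(2, -1)$ however the $x$s are related to each other, because $e$ is independent of both regressors. In the fourth graph the estimate for $x_1$ should still be close to 2, while the one for $x_2$ converges to $\beta_2 + \mathrm{Cov}(u, x_2)/\mathrm{Var}(x_2) = -1 + 1/2 = -0.5$ rather than $-1$.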

Carlos Cinelli
  • Do bidirectional dashed arrows represent related structural errors in DAGs? – markowitz Aug 13 '19 at 11:54
  • @markowitz exactly – Carlos Cinelli Aug 13 '19 at 20:24
  • In this framework a full understanding of the backdoor criterion is probably necessary (I'm studying it). However, it seems correct to me to say that if we collect the data and regress $y$ on $x_1$ and $x_2$, in none of the three cases above do we obtain the correct causal effect of both $x_1$ and $x_2$ on $y$. Is that right? – markowitz Aug 26 '19 at 12:44
  • Is there a DAG specification for which a simple regression $y = \beta_1 x_1 + \beta_2 x_2 + e$ returns parameters that both have causal meaning? – markowitz Sep 04 '19 at 12:05
  • @markowitz in the first three models, all of them would return parameters with causal meaning: the controlled *direct* effects, $E[y|do(x_1, x_2)] = \beta_1x_1 +\beta_2x_2 $. – Carlos Cinelli Sep 05 '19 at 06:26
  • But my question is: does $E[y|do(x_1,x_2)] = E[y|x_1,x_2]$ hold in the three cases above? It seems to me that it does not. Instead we have $E[y|x_1,x_2] = \theta_1x_1 + \theta_2 x_2$ where, maybe, $\theta_1 = \beta_1$ and $\theta_2$ is not equal to $\beta_2$. Is that correct? – markowitz Sep 05 '19 at 08:09
  • @markowitz Yes, in the first three models $E[Y|do(x_1, x_2)] = E[Y|x_1, x_2]$. In the fourth model we only have that $E[Y|do(x_1), x_2] = E[Y|x_1, x_2]$ but we cannot recover the causal effect of $x_2$. – Carlos Cinelli Sep 05 '19 at 21:03
  • I’m a bit confused. Put aside the fourth case, which is not consistent with the assumptions. The goal is to identify causal parameters from multiple regression given the assumptions above. Initially I understood that an SCM like the one above carries too little information for proper causal inference, because the causal nexus among the $x$s is relevant (see the comments also). – markowitz Sep 06 '19 at 12:26
  • Now you tell me that in all three cases above, in which the causal nexus among the $x$s is explicitly given via the DAG, a regression $E[y|x_1,x_2]$ always gives coefficients with causal meaning, regardless of the DAG (nexus). Is the causal nexus among the $x$s relevant or not? If so, are the three cases above special? – markowitz Sep 06 '19 at 12:26
  • Hi @markowitz all three cases above are indeed special, they have the error term completely exogenous from all regressors. – Carlos Cinelli Sep 06 '19 at 22:22
  • Hi @Carlos, but complete exogeneity is an assumption, not a special case. Moreover, I mean exogeneity in the standard econometric form, $E[e|X_i]=0$ with $X_i$ an $n \times 1$ vector. Do you mean that, regardless of the dimensionality and the causal nexus among the $x$s, in the class of SCMs specified in the question $E[y|x_1,...,x_n] = E[y|do(x_1,...,x_n)]$ is always true? – markowitz Sep 07 '19 at 08:16
  • Hi @markowitz yes, and the proof is simple. We have that $E[Y|do(x_1, \dots, x_n)] = \beta_1 x_1 + \dots + \beta_n x_n$ and that $E[e|x_1, \dots, x_n] = 0$. Thus, $E[Y|x_1, \dots, x_n] = \beta_1 x_1 + \dots + \beta_n x_n + E[e|x_1, \dots, x_n] = \beta_1 x_1 + \dots + \beta_n x_n = E[Y|do(x_1, \dots, x_n)]$. – Carlos Cinelli Sep 08 '19 at 01:18
  • @markowitz An "assumption" is a special case. When you say the error term is completely exogenous, you are assuming away any identification problem and positing that your model is one of the special cases of models where all direct effects are identified (such as models 1 to 3 above). – Carlos Cinelli Sep 08 '19 at 01:20
  • Let me modify/improve the question so you can reply in the answer – markowitz Sep 08 '19 at 07:53
  • Then the answer to the first sub-question is affirmative. Therefore let me note that if we are confident that the set of $x$s is exact, exhaustive causal inference is achievable with basic multiple regression. There are no "adjustment" problems. – markowitz Sep 17 '19 at 17:23
  • Let me connect with the comments here https://stats.stackexchange.com/questions/188557/omitted-variable-bias-which-predictors-do-i-need-to-include-and-why/312138?noredirect=1#comment798666_312138 – markowitz Sep 21 '19 at 10:14
  • I have other doubts. 1) Does the interventional expectation function $E[y|do(x_i)]$ always deal with the total effect of $x_i$ on $y$? 2) Do structural parameters in a linear SCM always capture the direct effect of a unit change in the causal variable ($x_i$) on the effect variable ($y$)? 3) In model 1 above, what is the value of $E[y|do(x_1)]$? Is it the total effect? Thanks for your patience – markowitz Sep 21 '19 at 10:15
  • @markowitz you should open another question. – Carlos Cinelli Sep 21 '19 at 16:46
  • I opened it: https://stats.stackexchange.com/questions/467570/linear-causal-model – markowitz May 20 '20 at 17:34