Currently I’m focused on a linear causal model expressed as a structural equation like this:

$y = \beta_1 x_1 + \beta_2 x_2 + … + \beta_k x_k + u$

where $E[u|x_1,x_2,…,x_k]=0$ (exogenous error)

We know nothing about the causal nexus and/or statistical dependencies among the $x$s. However, all the variables involved (the $x$s) are measurable, and no other variables are relevant for the explanation of $y$. The structural parameters $\beta_i$ are unknown constants.

I know that several DAGs are compatible with the specification above (see this strongly related question for some examples: Causality: Structural Causal Model and DAG). The specification above is quite general but, if I have understood correctly, the following related statements are right:

1) The structural coefficients $\beta_i$ represent the direct causal effect of $x_i$ on $y$ (for $i=1,\dots,k$), and we have $E[y|do(x_1,\dots,x_k)]=E[y|x_1,\dots,x_k]= \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k$. Therefore these effects are identified; in other words, all the direct effects are computable from the regression written above.

2) If there are no causal nexus among the $x$s and the $x$s are statistically independent of each other, we also have $E[y|do(x_i)]=E[y|x_i]=\beta_i x_i$ for $i=1,\dots,k$. If some dependencies exist, this conclusion is no longer true.

3) If there are no causal nexus among the $x$s, the direct causal effect of $x_i$ on $y$ coincides with its total causal effect. Moreover, the total effect is what is known in experimental language as the average causal effect (ACE) or average treatment effect on the treated (ATT); it is what is usually meant by causal effect in econometrics and what the backdoor criterion refers to.

4) If there are causal nexus among the $x$s but we don’t know what they are, we cannot know which combination of structural parameters gives us the total effects. Therefore it is not possible to identify them.

5) If we know all the causal nexus among the $x$s and there are no unobserved common causes or, equivalently, no correlated structural errors, then the causal effects (total and direct) are identifiable.
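Statement 2 can be checked with a small simulation. Below is a minimal numpy sketch; the coefficients $\beta_1=2$, $\beta_2=3$ and the dependence $x_2 = 0.5\,x_1 + \text{noise}$ are illustrative assumptions, not values from the question:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
b1, b2, g = 2.0, 3.0, 0.5   # structural coefficients; g links x1 -> x2

x1 = rng.normal(size=n)
x2 = g * x1 + rng.normal(size=n)          # x2 depends on x1
y = b1 * x1 + b2 * x2 + rng.normal(size=n)

# Regressing y on x1 and x2 together recovers the direct effects ...
X = np.column_stack([x1, x2])
beta_full = np.linalg.lstsq(X, y, rcond=None)[0]

# ... but regressing y on x1 alone gives b1 + b2*g (a total effect),
# so E[y|x1] != b1*x1 once the x's are dependent.
beta_x1_only = np.linalg.lstsq(x1[:, None], y, rcond=None)[0][0]

print(beta_full)       # approx [2.0, 3.0]
print(beta_x1_only)    # approx 2.0 + 3.0*0.5 = 3.5
```

So the joint regression still identifies the $\beta_i$, while the marginal regression of $y$ on $x_1$ alone does not return $\beta_1$ under dependence.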

Have I made some mistakes? If so, can you give me the simplest possible counterexamples and, then, the correct statements?

EDIT: I edited the post, deleting the two final sub-questions. I hope it now sounds good to the moderators.

  • When you specify *exogenous error*, you do it probabilistically rather than causally/structurally as your expression does not involve a $\text{do}$ operator. Is that the correct way to do it? – Richard Hardy May 20 '20 at 17:50
  • @RichardHardy strictly speaking you are right, but here he mentioned that the equation is structural (not a regression equation), so in that sense these would be causal. – Carlos Cinelli May 20 '20 at 18:56
  • @Richard Hardy, What Carlos Cinelli wrote is exactly what I supposed valid. – markowitz May 20 '20 at 19:50
  • Here are some casual first impressions, not diligently thought through. In 1), precisely what does *without more informations* mean? Is it necessary?. In 2), I do not think this is true unless we assume all the other $x$s have zero means. 3) and 4) look good. 5) ?. 6) part 1 looks OK, part 2 ?. 7) ?. – Richard Hardy May 20 '20 at 19:56
  • @Richard Hardy, about the phrase you underscore: it means that no other assumptions are made beyond those already given in the initial definitions and assumptions. Then statement 1 would be notably general. – markowitz May 21 '20 at 05:46
  • @markowitz, do you have an example of a case where additional information / additional assumptions would make the equality in 1) fail? – Richard Hardy May 21 '20 at 06:15
  • @Richard Hardy, let me underscore that 1 to 5 are only my guesses. I think that, setting aside pathological cases such as perfect multicollinearity, 1) always holds. In any case, note that my guesses move in the opposite direction: from 2 to 5 I add assumptions and, in addition to their related statements, I implicitly say that 1) still holds. Moreover, let me think about whether to insert a constant in the regressions or add the assumption of zero mean for the $x$s. – markowitz May 21 '20 at 07:52
  • @markowitz, I wonder if perfect multicollinearity should invalidate the equation in 1). I do not immediately see how it would. – Richard Hardy May 21 '20 at 08:35
  • It seems necessary to me because in perfect multicollinearity cases joint distributions are not always well defined, and hence neither are regressions. The multivariate normal is an example, though some remedies exist; see https://en.wikipedia.org/wiki/Multivariate_normal_distribution#Degenerate_case In any case, OLS regression is not well defined in the perfect multicollinearity case. I’m doubtful about interventional expectations, but I think they would also become not well defined. – markowitz May 21 '20 at 09:21
  • @markowitz, the equality in 1) considers theoretical quantities, not regression estimates, so the discussion of regression is irrelevant with respect to it (regardless of whether the discussion is correct on its own). I think the equation holds regardless of perfect multicollinearity. In that sense, I think *without more informations* is irrelevant and may be misleading. – Richard Hardy May 21 '20 at 10:22
  • The problem can become too detailed. However, if we intend “OLS regression” only as an estimation technique, I can agree with you, even if sample problems are relevant in practice. However, we can also interpret “OLS regression” as the best linear predictor and/or best linear approximation of the population conditional expectation; it coincides with the population conditional expectation if that is linear. Now, I read somewhere that in the perfect multicollinearity case some multivariate distributions become not well defined. – markowitz May 21 '20 at 11:10
  • If this is true, then the conditional expectation we are interested in also becomes not well defined, and its linear approximation fails even more surely. In any case, I'm definitely not interested in details like these here. Given the assumptions and definitions above, I simply intended 1) as correct in a quite general sense. About the phrase you underscore: I said in the comments what I intended, in a non-ambiguous way. However, I see your point and I have deleted it from the text. – markowitz May 21 '20 at 11:11
  • @Richard Hardy, I forgot: thanks for the feedback. – markowitz May 21 '20 at 12:13
  • @markowitz, you are welcome! I found the question interesting. – Richard Hardy May 21 '20 at 13:24
  • Markowitz, I will answer your question in a couple of days; I’m really busy now. But overall you are getting most things right. – Carlos Cinelli May 21 '20 at 19:38
  • Hi Carlos, how are you? Did you not answer because you have nothing relevant to say, or because you haven't had time yet? – markowitz Jun 22 '20 at 06:57

1 Answer


By structural I will understand that the structural equation encodes the average response of $Y$ when the $x$s are manipulated, that is:

$$ E[Y|do(x_1, \dots, x_k)]= \beta_1x_1 + \dots + \beta_kx_k $$

So answering your questions:

  1. That's correct. The proof is simple, since

$$ E[Y|x_1, \dots, x_k] = \beta_1x_1 + \dots + \beta_kx_k + E[u|x_1, \dots, x_k] = \beta_1x_1 + \dots + \beta_kx_k $$

where the last equality uses the exogeneity assumption $E[u|x_1, \dots, x_k]=0$.

As you said, these are the controlled direct effects of each $x_i$ when holding the other $x_j$ fixed.
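This equality can be checked by simulation: fit the multiple regression on observational data and compare it with the mean under an intervention. A minimal sketch, with assumed illustrative values ($\beta_1=2$, $\beta_2=3$, and $x_2$ depending on $x_1$):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
b1, b2 = 2.0, 3.0

def sample(n, do=None):
    """Draw from the SCM; `do` optionally overrides (x1, x2)."""
    x1 = rng.normal(size=n)
    x2 = 0.5 * x1 + rng.normal(size=n)
    if do is not None:
        x1 = np.full(n, do[0])
        x2 = np.full(n, do[1])
    y = b1 * x1 + b2 * x2 + rng.normal(size=n)
    return x1, x2, y

# Observational data: the multiple regression recovers b1, b2 ...
x1, x2, y = sample(n)
beta = np.linalg.lstsq(np.column_stack([x1, x2]), y, rcond=None)[0]

# ... and matches the interventional mean E[Y | do(x1=1, x2=1)].
_, _, y_do = sample(n, do=(1.0, 1.0))
print(beta)          # approx [2.0, 3.0]
print(y_do.mean())   # approx 2.0 + 3.0 = 5.0
```

Note the agreement holds even though the $x$s are dependent, since $u$ is exogenous given all of them.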

  2. If there are no causal effects among the $X$s and they are not confounded, then these coefficients are also the total effects. To see this, draw a DAG with all the $X$s pointing to $Y$ and no arrows between the $X$s. Note that to identify the total effect with $E[Y|x_i]$ alone, you need $X_i$ to be unconfounded without conditioning on all the other $X$s as well.

  3. Correct.

  4. Correct. For an example, imagine the graph $X_1 \rightarrow X_2$, $X_2\rightarrow Y$ and $X_1 \rightarrow Y$. Here $X_2$ is a mediator, and the total and direct effects of $X_1$ on $Y$ are different. But you could just flip the positions of $X_1$ and $X_2$, and now $X_2$ is a confounder for $X_1$, and the total and direct effects of $X_1$ on $Y$ are the same.
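The flipped (confounder) case can also be checked numerically: the total effect of $X_1$ equals its direct coefficient, while the naive regression of $Y$ on $X_1$ alone is confounded. A minimal sketch with assumed coefficients $b_1=2$, $b_2=3$:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200_000
b1, b2 = 2.0, 3.0

# Flipped graph: X2 -> X1, X2 -> Y, X1 -> Y (X2 confounds X1).
x2 = rng.normal(size=n)
x1 = 0.5 * x2 + rng.normal(size=n)
y = b1 * x1 + b2 * x2 + rng.normal(size=n)

# Total effect of x1 equals its direct effect b1 (no mediation):
# simulate do(x1=0) vs do(x1=1); x2's distribution is untouched.
y_do0 = b1 * 0.0 + b2 * x2 + rng.normal(size=n)
y_do1 = b1 * 1.0 + b2 * x2 + rng.normal(size=n)
total = y_do1.mean() - y_do0.mean()

# ... while the regression of y on x1 alone is confounded by x2.
naive = np.linalg.lstsq(x1[:, None], y, rcond=None)[0][0]
print(total)   # approx 2.0
print(naive)   # biased away from 2.0 (here approx 3.2)
```

The naive slope converges to $(b_1\operatorname{Var}(x_1)+b_2\operatorname{Cov}(x_1,x_2))/\operatorname{Var}(x_1) = 3.2$ in this setup, not $b_1 = 2$.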

  5. Correct. If you know the DAG and the model is Markovian (all errors are independent), then all causal effects (direct and indirect) are identified.
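For the Markovian case, the total effect can be composed from separately estimable structural pieces once the DAG is known. A sketch for the mediator graph $X_1 \rightarrow X_2 \rightarrow Y$, $X_1 \rightarrow Y$, with assumed values $b_1=2$, $b_2=3$, $g=0.5$:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 200_000
b1, b2, g = 2.0, 3.0, 0.5

# Markovian model for X1 -> X2 -> Y, X1 -> Y: knowing the DAG,
# the total effect of X1 on Y is b1 + b2*g, and each piece is
# estimable from observational data.
x1 = rng.normal(size=n)
x2 = g * x1 + rng.normal(size=n)
y = b1 * x1 + b2 * x2 + rng.normal(size=n)

g_hat = np.linalg.lstsq(x1[:, None], x2, rcond=None)[0][0]
b_hat = np.linalg.lstsq(np.column_stack([x1, x2]), y, rcond=None)[0]
total_hat = b_hat[0] + b_hat[1] * g_hat
print(total_hat)   # approx 2.0 + 3.0*0.5 = 3.5
```

Without the DAG one would not know that $b_1 + b_2 g$ (rather than, say, $b_1$ alone) is the right combination, which is exactly the identification failure in statement 4.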

  • Hi Carlos, first of all thanks for your reply. Statement 2 is the only one where you do not clearly affirm that it is correct, so I ask for some clarification. Absence of causal nexus among the $x$s is in statement 2; you report it but add “… and they are not confounded, then these coefficients are also the total effects.” Now, if I am not wrong, “confounding variable” is a causal qualification that implies some nexus. – markowitz Jul 23 '20 at 10:27
  • Then absence of causal nexus implies that the variables are not confounded. Moreover, the second part of your phrase that I report here represents the conclusion supposed in point 3, which requires precisely, and only, no causal nexus among the $x$s. Is your addition only a semantic reinforcement, or does it reveal some necessity? – markowitz Jul 23 '20 at 10:27
  • Again on point 2: I specify that the $x$s are independent of each other because the absence of causal nexus among them does not guarantee statistical independence. However, it seems to me that this condition is not necessary for the validity of the equation written in point 2. Rather, if mutual independence holds, several multivariate regressions are also good for identifying the $\beta$s. Do you agree? – markowitz Jul 23 '20 at 10:28
  • @markowitz I wasn't sure what you meant by causal nexus exactly, perhaps you meant just that one $X_i$ does not cause another $X_j$, so that's why I added the unconfoundedness; for instance, you could have a latent confounder causing both $X_i$ and $X_j$, in which case adjusting for both $X_i$ and $X_j$ recovers their causal effect, but adjusting for only one of them doesn't. – Carlos Cinelli Jul 23 '20 at 17:16
  • Precisely: by no causal nexus among the $x$s I mean that no directed arrow goes from $x_i$ to $x_j$ for any $i$ and $j$. For a latent confounder between $x_i$ and $x_j$, should I think of something like correlated structural errors? Or of a third variable, say $z$, that causes both? Maybe these two things coincide? – markowitz Jul 24 '20 at 09:47
  • @markowitz either way works, a latent confounder results in related structural errors. – Carlos Cinelli Jul 24 '20 at 18:55
  • Hi Carlos how are you? I have written a related discussion here (https://stats.stackexchange.com/questions/494625/whats-the-dgp-in-causal-inference). Your comments/suggestions would be welcome! – markowitz Nov 06 '20 at 17:13