
Pearl et al. "Causal Inference in Statistics: A Primer" (2016) p. 61 presents the backdoor criterion:

Definition 3.3.1 (The Backdoor Criterion) Given an ordered pair of variables $(X,Y)$ in a directed acyclic graph $G$, a set of variables $Z$ satisfies the backdoor criterion relative to $(X,Y)$ if no node in $Z$ is a descendant of $X$, and $Z$ blocks every path between $X$ and $Y$ that contains an arrow into $X$. If a set of variables $Z$ satisfies the backdoor criterion for $X$ and $Y$, then the causal effect of $X$ on $Y$ is given by the formula $$ P(Y=y|do(X=x))=\sum_{z} P(Y=y|X=x,Z=z)P(Z=z) $$ just as when we adjust for $\text{PA}(X)$.

It then adds in parentheses:

Note that $\text{PA}(X)$ always satisfies the backdoor criterion.

The latter note is not obvious to me. Consider a DAG that is a simple chain $$ Z \rightarrow X \rightarrow Y $$ Here, $\text{PA}(X)=Z$. At the same time, there is no backdoor path between $X$ and $Y$, so $Z$ does not block any. Question: How come $Z$ satisfies the backdoor criterion then?

ColorStatistics
Richard Hardy
    I suppose it's simply a matter of reading $Z$ blocks every path between $X$ and $Y$ that contains an arrow into $X$ *if any exists*. – Tim Mak Apr 07 '20 at 07:14
    @TimMak Exactly right. The empty set could serve as satisfying the backdoor criterion here, but adding that particular $Z$ doesn't hurt anything (that is, it doesn't open up previously closed paths via a collider). – Adrian Keister Apr 07 '20 at 13:10

2 Answers


Conditioning on the parents of $X$ always satisfies the backdoor criterion relative to $(X,Y)$, whatever the DAG looks like, for two reasons. First, no parent of $X$ can be a descendant of $X$, because the graph is acyclic. Second, every backdoor path starts with an arrow into $X$, so the node next to $X$ on that path is a parent of $X$; that parent has an arrow pointing out of it (into $X$), so it cannot be a collider on the path, and conditioning on it blocks the path. Hence conditioning on the set of parents of $X$ blocks all backdoor paths, leaves no spurious path open, and leaves all directed paths from $X$ to $Y$ untouched.
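The argument above can be checked mechanically on small graphs. The following is a minimal sketch in plain Python, not a reference implementation: the function names (`satisfies_backdoor`, `backdoor_paths`, and so on) and the second example graph with an observed confounder `W` and a mediator `M` are assumptions made for illustration and do not come from the book. It enumerates every path that starts with an arrow into $X$ and tests whether a candidate set blocks it.

```python
def parents(edges, node):
    """Nodes with an arrow into `node`."""
    return {u for u, v in edges if v == node}


def children(edges, node):
    """Nodes that `node` points to."""
    return {v for u, v in edges if u == node}


def descendants(edges, node):
    """All nodes reachable from `node` along directed edges."""
    seen, stack = set(), [node]
    while stack:
        for child in children(edges, stack.pop()):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def backdoor_paths(edges, x, y):
    """Yield simple paths from x to y whose first edge points into x."""
    def extend(path):
        last = path[-1]
        if last == y:
            yield path
            return
        for nxt in parents(edges, last) | children(edges, last):
            if nxt not in path:
                yield from extend(path + [nxt])

    for p in parents(edges, x):
        yield from extend([x, p])


def is_blocked(edges, path, z):
    """d-separation rule applied to a single path, given conditioning set z."""
    for a, m, b in zip(path, path[1:], path[2:]):
        collider = (a, m) in edges and (b, m) in edges
        if collider:
            # A collider blocks unless it, or one of its descendants, is in z.
            if m not in z and not (descendants(edges, m) & z):
                return True
        elif m in z:
            # A chain or fork node blocks when it is conditioned on.
            return True
    return False


def satisfies_backdoor(edges, x, y, z):
    z = set(z)
    if z & descendants(edges, x):  # condition (i): no descendants of x in z
        return False
    # condition (ii): every backdoor path is blocked (vacuously true if none exist)
    return all(is_blocked(edges, p, z) for p in backdoor_paths(edges, x, y))


# The chain from the question: Z -> X -> Y.
chain = {("Z", "X"), ("X", "Y")}
# Hypothetical graph with an observed confounder W and a mediator M.
confounded = {("W", "X"), ("W", "Y"), ("Z", "X"), ("X", "M"), ("M", "Y")}

if __name__ == "__main__":
    print(satisfies_backdoor(chain, "X", "Y", parents(chain, "X")))
    print(satisfies_backdoor(chain, "X", "Y", set()))
    print(satisfies_backdoor(confounded, "X", "Y", parents(confounded, "X")))
    print(satisfies_backdoor(confounded, "X", "Y", set()))
```

In the chain, both $\text{PA}(X)$ and the empty set pass, because `all(...)` over zero backdoor paths is vacuously true. In the hypothetical confounded graph, $\text{PA}(X)=\{W,Z\}$ passes while the empty set fails, since the path $X \leftarrow W \rightarrow Y$ stays open.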


With regard to your specific DAG, $Z \rightarrow X \rightarrow Y$: $Z$, the parent of $X$, does satisfy the backdoor criterion, albeit trivially. There is no backdoor path to begin with, so none remains open once we condition on $Z$; all directed paths from $X$ to $Y$ remain unperturbed; no new spurious paths are created. But, of course, the empty set also satisfies the backdoor criterion in this case.
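As a quick check that the two adjustment sets agree in this chain, one can write out the adjustment formula with $Z$ and use the fact that, in $Z \rightarrow X \rightarrow Y$, $Y$ is independent of $Z$ given $X$:

$$ \sum_{z} P(Y=y|X=x,Z=z)P(Z=z) = \sum_{z} P(Y=y|X=x)P(Z=z) = P(Y=y|X=x) $$

The right-hand side is exactly what adjusting for the empty set gives, so both choices identify the same $P(Y=y|do(X=x))$.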

However, there are at least three reasons why, when interested in the causal effect of $X$ on $Y$, we would prefer to reduce the DAG you brought up, $Z \rightarrow X \rightarrow Y$, to $X \rightarrow Y$ instead, that is, to leave $Z$ out of the adjustment set.

  1. $Y$ and $Z$ are independent conditional on $X$: $P(Y|X,Z)=P(Y|X)$. We gain nothing by conditioning on $Z$ once we have already conditioned on $X$. Put differently, $Z$ here is neutral in terms of bias reduction.
  2. Controlling for $Z$ removes part of the variation in $X$ and hence reduces the precision of the estimate of the average causal effect.
  3. To the extent that there are unobserved common causes $U$ of $X$ and $Y$, controlling for $Z$ will amplify the resulting bias (the bias due to the association via $U$).
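Points 2 and 3 can be illustrated with a small worked example. This is a sketch under assumptions that are not in the original answer: a linear model with coefficients $a, b, c, d$, where $Z$, $U$, $\varepsilon_X$, $\varepsilon_Y$ are mutually independent with mean zero and unit variance, and $U$ is unobserved:

$$ X = aZ + cU + \varepsilon_X, \qquad Y = bX + dU + \varepsilon_Y $$

Regressing $Y$ on $X$ alone gives, in the population,

$$ \frac{\text{Cov}(X,Y)}{\text{Var}(X)} = b + \frac{dc}{a^2 + c^2 + 1} $$

Regressing $Y$ on $X$ and $Z$ is equivalent to using the part of $X$ not explained by $Z$, namely $\tilde{X} = X - aZ = cU + \varepsilon_X$, which gives

$$ \frac{\text{Cov}(\tilde{X},Y)}{\text{Var}(\tilde{X})} = b + \frac{dc}{c^2 + 1} $$

The true effect is $b$ in both cases. Whenever $a \neq 0$ and $dc \neq 0$, the bias term is larger in absolute value after controlling for $Z$, because the denominator shrinks from $a^2+c^2+1$ to $c^2+1$. The same shrinkage in the usable variation of $X$ is what costs precision in point 2: with no confounding ($c = 0$) both estimates are unbiased, but the sampling variance of the slope is roughly inversely proportional to that denominator.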


See here for more on this.

ColorStatistics

The latter note is not obvious to me. Consider a DAG that is a simple chain $$ Z \rightarrow X \rightarrow Y $$ Here, $\text{PA}(X)=Z$. At the same time, there is no backdoor path between $X$ and $Y$, so $Z$ does not block any. Question: How come $Z$ satisfies the backdoor criterion then?

$Z$ satisfies the backdoor criterion because no backdoor path between $X$ and $Y$ remains open in the DAG if we condition on $Z$.

Since we are interested in the (total) causal effect of $X$ on $Y$, a control set that contains $Z$ is a valid control set. The empty set is valid as well, since it also satisfies the backdoor criterion.

If your concern is about the soundness of the definition you quoted above, I suggest this reading:

Given an ordered pair of variables $(X,Y)$ in a directed acyclic graph $G$, a set of variables $Z$ satisfies the backdoor criterion relative to $(X,Y)$ if, when conditioning on the control set $Z$, no directed (causal) path from $X$ to $Y$ is blocked and no spurious (backdoor) path remains open.

markowitz