
I keep seeing the $do(x)$ operator in a literature review I am doing on causality (see, for instance, this Wikipedia entry). However, I cannot find a formal, general definition of this operator.

Can someone point me to a good reference on this? I am interested in a general definition rather than its interpretation in a particular experiment.

kjetil b halvorsen
Judio

2 Answers


A probabilistic Structural Causal Model (SCM) is defined as a tuple $M = \langle U, V, F, P(U) \rangle$, where $U$ is a set of exogenous variables, $V$ is a set of endogenous variables, $F$ is a set of structural equations that determine the value of each endogenous variable, and $P(U)$ is a probability distribution over the domain of $U$.

In an SCM we represent the effect of an intervention on a variable $X$ by a submodel $M_x = \langle U, V, F_x, P(U) \rangle$, where $F_x$ indicates that the structural equation for $X$ is replaced by a new interventional equation. For example, the atomic intervention of setting the variable $X$ to a specific value $x$ (usually denoted by $do(X = x)$) consists of replacing the equation for $X$ with the equation $X = x$.

To make ideas clear, imagine a nonparametric structural causal model $M$ defined by the following structural equations:

$$ Z = U_z\\ X = f(Z, U_x)\\ Y = g(X,Z, U_y) $$

Here the disturbances $U$ have some probability distribution $P(U)$. This induces a probability distribution over the endogenous variables, $P_M(Y, Z, X)$, and in particular a conditional distribution of $Y$ given $X$, $P_M(Y|X)$.

But notice that $P_M(Y|X)$ is the "observational" distribution of $Y$ given $X$ in the context of model $M$. What would be the effect on the distribution of $Y$ if we intervened on $X$, setting it to $x$? This is nothing more than the probability distribution of $Y$ induced by the modified model $M_x$:

$$ Z = U_z\\ X = x\\ Y = g(X, Z, U_y) $$

That is, the interventional probability of $Y$ if we set $X = x$ is the probability induced in the submodel $M_x$, namely $P_{M_x}(Y|X=x)$, and it is usually denoted by $P(Y|do(X = x))$. The $do(X = x)$ operator makes it clear that we are computing the probability of $Y$ in a submodel where an intervention sets $X$ equal to $x$, which corresponds to overriding the structural equation of $X$ with the equation $X = x$.
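To see the difference between conditioning and intervening concretely, here is a minimal sketch in Python. Everything in it is an illustrative assumption rather than part of the definition: the variables are binary, the disturbances are independent Bernoulli draws, and the particular choices of $f$ and $g$ (as well as the helper names `do`, `joint`, and `prob`) are made up for the example. The `do` helper implements the operator literally, by returning a copy of $F$ in which the equation for the intervened variable is replaced by a constant.

```python
from itertools import product

# Hypothetical binary instance of the model above; all numbers are assumptions.
# P(U = 1) for each independent exogenous variable.
P_U = {"U_z": 0.5, "U_x": 0.2, "U_y": 0.1}

# Structural equations F, stored in causal order Z -> X -> Y.
# Each equation maps (endogenous values so far, exogenous values) to a value.
F = {
    "Z": lambda v, u: u["U_z"],                      # Z = U_z
    "X": lambda v, u: v["Z"] ^ u["U_x"],             # X = f(Z, U_x)
    "Y": lambda v, u: (v["X"] & v["Z"]) ^ u["U_y"],  # Y = g(X, Z, U_y)
}

def do(equations, **fixed):
    """Return the submodel's equations F_x: each intervened variable's
    equation is replaced by a constant; everything else is untouched."""
    new = dict(equations)
    for name, value in fixed.items():
        new[name] = lambda v, u, value=value: value
    return new

def joint(equations):
    """Exact joint distribution over (Z, X, Y), obtained by enumerating U."""
    dist = {}
    names = list(P_U)
    for bits in product([0, 1], repeat=len(names)):
        u = dict(zip(names, bits))
        p = 1.0
        for n in names:
            p *= P_U[n] if u[n] else 1 - P_U[n]
        v = {}
        for var, eq in equations.items():  # dicts preserve insertion order
            v[var] = eq(v, u)
        key = (v["Z"], v["X"], v["Y"])
        dist[key] = dist.get(key, 0.0) + p
    return dist

def prob(dist, event, given=lambda z, x, y: True):
    """P(event | given) under the distribution `dist`."""
    num = sum(p for k, p in dist.items() if given(*k) and event(*k))
    den = sum(p for k, p in dist.items() if given(*k))
    return num / den

# Observational: P_M(Y = 1 | X = 1)
p_obs = prob(joint(F), lambda z, x, y: y == 1, lambda z, x, y: x == 1)

# Interventional: P(Y = 1 | do(X = 1)), computed in the submodel M_x
p_do = prob(joint(do(F, X=1)), lambda z, x, y: y == 1)

print(p_obs, p_do)  # roughly 0.74 versus 0.5 for these assumed numbers
```

In this toy model the two quantities differ because $Z$ affects both $X$ and $Y$: observing $X = 1$ is informative about $Z$, whereas setting $X = 1$ leaves the distribution of $Z$ unchanged.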

The goal of many analyses is to find how to express the interventional distribution $P(Y|do(X))$ in terms of the joint probability of the observational (pre-intervention) distribution.
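For the example model above, such an expression can be written down directly, under the additional assumption (not stated in the model itself) that the disturbances $U_z$, $U_x$, $U_y$ are mutually independent and that $P(X = x, Z = z) > 0$ for every $z$. In that case $Z$ blocks the only non-causal path between $X$ and $Y$, and adjusting for it gives

$$ P(Y = y \mid do(X = x)) = \sum_z P(Y = y \mid X = x, Z = z)\, P(Z = z) $$

The right-hand side involves only observational quantities. Compare this with ordinary conditioning, $P(Y = y \mid X = x) = \sum_z P(Y = y \mid X = x, Z = z)\, P(Z = z \mid X = x)$: the two differ only in whether $Z$ is weighted by its marginal distribution or by its distribution given $X = x$.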

do-calculus

The do-calculus is not the same thing as the $do(\cdot)$ operator. The do-calculus consists of three inference rules to help "massage" the post-intervention probability distribution and get $P(Y|do(X))$ in terms of the observational (pre-intervention) distribution. Hence, instead of doing derivations by hand, such as in this question, you can let an algorithm perform the derivations and automatically give you a nonparametric expression for identifying your causal query of interest (and the do-calculus is complete for recursive nonparametric structural causal models).
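For reference, the three rules are usually stated as follows (this is a sketch in the notation commonly used in Pearl's presentations, not a quotation). Let $G$ be the causal DAG of the model, let $G_{\overline{X}}$ denote $G$ with all arrows pointing into $X$ removed, and let $G_{\underline{Z}}$ denote $G$ with all arrows emanating from $Z$ removed. For disjoint sets of variables $X$, $Y$, $Z$, $W$:

Rule 1 (insertion or deletion of observations):

$$ P(y \mid do(x), z, w) = P(y \mid do(x), w) \quad \text{if } (Y \perp Z \mid X, W) \text{ in } G_{\overline{X}} $$

Rule 2 (exchange of action and observation):

$$ P(y \mid do(x), do(z), w) = P(y \mid do(x), z, w) \quad \text{if } (Y \perp Z \mid X, W) \text{ in } G_{\overline{X}\,\underline{Z}} $$

Rule 3 (insertion or deletion of actions):

$$ P(y \mid do(x), do(z), w) = P(y \mid do(x), w) \quad \text{if } (Y \perp Z \mid X, W) \text{ in } G_{\overline{X}\,\overline{Z(W)}} $$

where $Z(W)$ is the set of nodes in $Z$ that are not ancestors of any node in $W$ in $G_{\overline{X}}$, and $\perp$ denotes d-separation in the indicated graph.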

Carlos Cinelli
  • I think you may be among the few on cross validated who might be interested in and able to answer this question: https://stats.stackexchange.com/q/444249/62396 – joshphysics Jan 11 '20 at 00:04
  • I know it is a very minor detail, but why must a probabilistic model be a 'tuple' instead of a 'set'? In which instances would we have to protect against repetition of elements or care about its order? – Kuku Nov 01 '21 at 10:59

That is $do$-calculus. They explain it here:

Interventions and counterfactuals are defined through a mathematical operator called $do(x)$, which simulates physical interventions by deleting certain functions from the model, replacing them with a constant $X = x$, while keeping the rest of the model unchanged. The resulting model is denoted $M_x$.

mbiron