5

I am reading Pearl's causality book and it states,

Identifiability ensures that the added assumptions conveyed by $M$ ... will supply the missing information without explicating $M$ in detail.

However, I still do not understand the requirement for causal identification and would appreciate some elaboration of the above statement.

tdy
desert_ranger
    Causal identification is equivalent to being able to estimate a causal quantity in terms of observed data. That passage "not explicating M in detail" means just using observed data. – user551504 Nov 19 '21 at 15:14
    Try doing a deep dive on Lewbel's "The Identification Zoo: Meanings of Identification in Econometrics". – Zen Nov 19 '21 at 18:35

3 Answers

1

In causal inference, you can think of identifiability as the condition that permits a causal quantity to be measured from observed data. In parametric models, it is the condition that permits causal parameters to be estimated by regression. Formally,

$E[Y \mid do(X)] = E[Y \mid X]$

can be considered an identifiability condition: when it holds, the interventional quantity on the left can be computed from the observational quantity on the right. An identifiability condition is a key component of any causal model. In general, if you do not have enough assumptions and/or data, identification is not possible.
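To make the equality above concrete, here is a minimal simulation sketch. It assumes a hypothetical linear model with a single observed confounder `Z` (the coefficients 2.0, 3.0 and the true effect 1.0 are made up for illustration). The regression of `Y` on `X` alone does not recover the causal effect, while the regression that also adjusts for `Z` does, because in this assumed model conditioning on `Z` is what makes the effect identifiable.

```python
import numpy as np

def simulate(n=200_000, seed=0):
    """Hypothetical linear model: Z -> X, Z -> Y, and X -> Y with true effect 1.0."""
    rng = np.random.default_rng(seed)
    z = rng.normal(size=n)                       # observed confounder
    x = 2.0 * z + rng.normal(size=n)             # treatment depends on Z
    y = 1.0 * x + 3.0 * z + rng.normal(size=n)   # outcome depends on X and Z
    return z, x, y

def ols_slope_on_x(y, x, *covariates):
    """Coefficient on x from least squares of y on [1, x, covariates...]."""
    design = np.column_stack([np.ones_like(x), x, *covariates])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[1]

z, x, y = simulate()
naive = ols_slope_on_x(y, x)        # slope of E[Y | X], confounded by Z
adjusted = ols_slope_on_x(y, x, z)  # slope after adjusting for Z

print(f"naive slope:    {naive:.2f}")
print(f"adjusted slope: {adjusted:.2f}")
```

Under these assumptions the population value of the naive slope is 11/5 = 2.2, not 1.0, so here $E[Y \mid do(X)] \ne E[Y \mid X]$; only the adjusted regression targets the true effect.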

For examples, see these related questions:

In Berkson's paradox, is $\beta_1 = 0$ or $\ne 0$?

Infer one link of a causal structure, from observations

markowitz
1

Let's say you have a Treatment variable, an Outcome variable, and numerous other variables. One could regress one on the other, adjusting for everything else, but we're smarter than this, right? How do we know this measures the direct relationship between treatment and outcome? Maybe we're not adjusting for an important confounder. Maybe we're adjusting for a collider and making our estimate worse, moving it away from what we really want.
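The collider case can be illustrated with a small simulation sketch. The model here is hypothetical: `X` affects `Y` with a true effect of 1.0, there is no confounding, and `C` is a collider caused by both `X` and `Y`. All coefficients are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

x = rng.normal(size=n)               # treatment
y = 1.0 * x + rng.normal(size=n)     # outcome, true effect of X is 1.0
c = x + y + rng.normal(size=n)       # collider: caused by both X and Y

def slope_on_first(y, *regressors):
    """Coefficient on the first regressor from least squares with an intercept."""
    design = np.column_stack([np.ones_like(y), *regressors])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[1]

unadjusted = slope_on_first(y, x)            # no adjustment needed in this model
collider_adjusted = slope_on_first(y, x, c)  # "adjusting for everything" includes C

print(f"unadjusted slope:        {unadjusted:.2f}")
print(f"collider-adjusted slope: {collider_adjusted:.2f}")
```

In this assumed model the unadjusted regression already targets the true effect of 1.0, while adding the collider drives the population coefficient on `X` to 0, so adjusting for more variables is not automatically safer.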

The causal identification step is important because it tells us whether it is possible to estimate the effect of Treatment on Outcome and, if it is, how to do so (backdoor adjustment, frontdoor adjustment, and so on). Sometimes the effect is not identifiable, and there is nothing we can do :|. Once the identification step is done, you can estimate the causal effect.
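For reference, the two adjustment strategies named above correspond to the following standard formulas, written here for discrete variables. The symbols $Z$ (a set satisfying the backdoor criterion relative to $X$ and $Y$) and $M$ (a mediator satisfying the front-door criterion) are notation introduced for this sketch, not taken from the question.

$$P(y \mid do(x)) = \sum_{z} P(y \mid x, z)\, P(z) \qquad \text{(backdoor adjustment)}$$

$$P(y \mid do(x)) = \sum_{m} P(m \mid x) \sum_{x'} P(y \mid x', m)\, P(x') \qquad \text{(front-door adjustment)}$$

In both cases the right-hand side contains only observational distributions, which is what it means for the effect to be identified.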

mribeirodantas
1

This is my understanding. Correct me if I am wrong.

Suppose I am conducting a randomized clinical trial to investigate the difference in means between two treatments, A and B. If I randomize everyone to treatment B with probability 1, then the population mean for treatment A and the difference in means are both unidentifiable. There is no data available to estimate these population quantities.

Suppose instead that I randomize subjects to treatment B with probability 1/2, and during the course of the trial some subjects switch to other therapies. The causal treatment effect (the difference in population means between treatments A and B) in the envisioned scenario where post-baseline treatment switching does not occur is unidentifiable using the observed data.

In both examples, I could use an unverifiable missing-data assumption that makes the population treatment effect identifiable.
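The first example can be sketched in code. The function `observed_trial` and its parameters (`mean_a`, `mean_b`, `p_b`) are hypothetical names for this illustration. The idea is that two "worlds" with very different population means under A produce exactly the same observed data when everyone receives B, so no estimator could tell them apart.

```python
import numpy as np

def observed_trial(mean_a, mean_b, p_b, n=1000, seed=0):
    """Simulate a trial and return (assignment to B, observed outcome).

    mean_a and mean_b are hypothetical population means under A and B;
    p_b is the probability of being randomized to B.
    """
    rng = np.random.default_rng(seed)
    gets_b = rng.random(n) < p_b
    noise = rng.normal(size=n)
    y_a = mean_a + noise                 # potential outcome under A
    y_b = mean_b + noise                 # potential outcome under B
    y_obs = np.where(gets_b, y_b, y_a)   # only one potential outcome is observed
    return gets_b, y_obs

def same_observed_data(trial_1, trial_2):
    """True if assignments and observed outcomes match exactly."""
    return all(np.array_equal(u, v) for u, v in zip(trial_1, trial_2))

# Two worlds that differ only in the population mean under A (0.0 versus 5.0).
same_data = same_observed_data(
    observed_trial(mean_a=0.0, mean_b=1.0, p_b=1.0),
    observed_trial(mean_a=5.0, mean_b=1.0, p_b=1.0),
)
same_half = same_observed_data(
    observed_trial(mean_a=0.0, mean_b=1.0, p_b=0.5),
    observed_trial(mean_a=5.0, mean_b=1.0, p_b=0.5),
)

print("p_b = 1.0, identical observed data:", same_data)
print("p_b = 0.5, identical observed data:", same_half)
```

With `p_b=1.0` the two datasets coincide, so the mean under A is not identified. With `p_b=0.5` they differ, which is why ordinary randomization identifies the difference in means.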

Geoffrey Johnson