2

The following site (http://www-ist.massey.ac.nz/dstirlin/CAST/CAST/Hcausal/causal_c2.html) defines a causal relationship as one where one variable 'directly' affects the other, without the other variable having any influence on the first. The following site (http://medical-dictionary.thefreedictionary.com/direct+causal+association), however, defines a direct causal relationship as one where one variable causes a change in the other and there are no intervening variables.

The two definitions are different, though. You can have a relationship with no intervening variables in which both variables directly affect each other. One wouldn't dare call this a correlation, but according to the first definition it wouldn't be a causation either.

Moreover, I assume there are also relationships with intervening variables in which one variable can cause a change in the other without the reverse being possible. According to the second definition, this wouldn't be a causation.

This raises the question: is there a difference between a causal relationship and a DIRECT causal relationship?

Mathematician
  • 2
    I'd love to read the answers of others, but this distinction may in practice be more aesthetic, more of a judgement call? If I press the elevator call button and the elevator door opens, is that a direct causal effect? What if I tell Sam to press the elevator button? What if I tell Sam to go grab lunch (and grabbing lunch involves taking the elevator)? Clearly the effect is causal in each case, but when would calling the elevator be a direct effect? Almost anytime A causes C, you can more precisely define the relationship and find a B so that A causes B causes C. – Matthew Gunn Apr 01 '17 at 18:21
  • 1
    @MatthewGunn When you press the button in the "minimal intervention" sense (which usually everyone does in the real world), and the door opens, this is by no means a "direct" effect (because you only change one variable - the button). Similar for "telling Sam", and for telling him to grab lunch. In all these cases, it would be a "total" causal effect. Compare this to pushing the button, but at the same time (or shortly before) cutting the wire from the button to the circuit running the elevator. You will get a different effect. – Julian Schuessler Apr 01 '17 at 18:27
  • 1
    @JulianSchuessler But if I squeezed wire cutters to cut a wire, then cutting the wire isn't a direct effect because there was the intervening mechanism of the fulcrum to redirect force? What if the cut was small enough so that electricity could still travel through the circuit? (There's an intermediate variable of how large the cut is.) This "direct" vs. "indirect" distinction seems like something English majors and lawyers may argue over endlessly rather than a cleanly defined physics or mathematical concept. – Matthew Gunn Apr 01 '17 at 18:54
  • @MatthewGunn It would be a direct effect because you changed two variables: the button and the wire, where the wire transmits the effect of the button on the elevator. See my answer below. Also, controlled/natural direct and indirect effects have been given exact mathematical definitions, are highly policy-relevant, and are the object of ongoing research in a number of quantitative disciplines. See for example Pearl, Judea 2001, Direct and indirect effects, UAI'01 Proceedings of the Seventeenth Conference on Uncertainty in Artificial Intelligence. – Julian Schuessler Apr 01 '17 at 19:46
  • 1
    @JulianSchuessler Judea Pearl defines natural direct effects etc... relative to the "setting $Z=z$ of *other variables* in the model [emphasis added]." Including different variables $Z$ in the model can change whether $X$ has a direct effect on $Y$ under Pearl's definitions. Maybe this is a stupid/obvious point since everything in statistics is relative to some model, but I think it's worth mentioning that what's direct depends on what variables you include in the model, and what variables you should include is often unclear, open to debate. – Matthew Gunn Apr 02 '17 at 00:36
  • @MatthewGunn Yes. The more recent literature, however, usually defines such effects with respect to a treatment and one or a few more potential moderators (e.g., excluding covariates), because this is what policy-makers can at most control, or what researchers are interested in for scientific reasons. You might find Pearl 2014 "Interpretation and identification of causal mediation." Psychological Methods interesting in that regard – Julian Schuessler Apr 02 '17 at 09:06

2 Answers

5

Causal relationships can be direct or indirect. That is, direct causal relationships are a special case of causal relationships. For example, in $A \rightarrow B \rightarrow C$, the node $A$ is a cause of $C$ but it affects $C$ through $B$, so although there is a causal relationship between $A$ and $C$, this is an indirect causal relationship.

About the relationship between causation and correlation, note that "correlation does not imply causation". For example, in $B \leftarrow A \rightarrow C$, $B$ and $C$ are correlated but do not have any causal relationship.
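
As a rough numerical illustration of the two graphs above, a minimal simulation sketch (assuming numpy; the coefficients are arbitrary): in the chain, $A$ and $C$ are correlated and intervening on $A$ would change $C$; in the fork, $B$ and $C$ are also correlated, yet intervening on $B$ would leave $C$ unchanged.

```python
# Sketch: simulate the chain A -> B -> C and the fork B <- A -> C,
# then compare correlations.
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Chain A -> B -> C: A is an (indirect) cause of C.
A = rng.normal(size=n)
B = 2.0 * A + rng.normal(size=n)
C = 3.0 * B + rng.normal(size=n)
print("chain: corr(A, C) =", round(np.corrcoef(A, C)[0, 1], 2))  # clearly nonzero

# Fork B <- A -> C: B and C are correlated, but neither causes the other.
A = rng.normal(size=n)
B = 2.0 * A + rng.normal(size=n)
C = 3.0 * A + rng.normal(size=n)
print("fork:  corr(B, C) =", round(np.corrcoef(B, C)[0, 1], 2))  # also clearly nonzero
```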

Hossein
  • So in that case, all correlations are essentially causation (except in extreme cases where there is absolutely no link between two correlated variables), but one must identify the causal link before he states that. Correct? – Mathematician Apr 01 '17 at 06:49
  • Definitely not. Correlation does not imply causation. For example, in $B \leftarrow A \rightarrow C$, nodes $B$ and $C$ are correlated but there is no causal relationship between them. There should be a directed path between two variables to conclude causation. – Hossein Apr 01 '17 at 06:51
  • Also, there was another part to my question: what if C can somehow cause A as well, through node D, let's say. Would it still be called a causal relationship? Or if C could influence A through node B, would it still be called a causal relationship? From what I understand, a correlation remains a correlation until you either establish no causal link or some form of causal link. – Mathematician Apr 01 '17 at 06:56
  • Are you asking about $C \rightarrow D \rightarrow A$? Here, $C$ and $A$ have a causal relationship, but an indirect one. About the relationship between causation and correlation, I updated the answer. – Hossein Apr 01 '17 at 07:17
  • Indeed, but I have also asked about whether or not a causal relationship would be causal if A and C were linked through B. Basically, can a causal relationship go both ways? – Mathematician Apr 01 '17 at 08:26
  • There is a wonderful correlation between piracy and global warming, but it is doubtful there is any causation. While the O-ring failure caused the Space Shuttle Challenger disaster, there is little correlation between failed o-rings and failed shuttle launches. – Tavrock Apr 01 '17 at 08:34
  • Is there a numerical-statistical-probabilistic definition of causation? All I have seen so far are word examples or arrows like C→D→A, neither of which is numerical. – Zahava Kor Apr 01 '17 at 15:07
  • @Tavrock Why is there little correlation between failed o-rings and failed shuttle launches? I think they are highly correlated, because if we have some evidence about an o-ring failure, the probability of the disaster would be increased. – Hossein Apr 01 '17 at 16:05
  • @Mathematician I think the subject of causal loops is more a philosophical question. – Hossein Apr 01 '17 at 16:12
  • @Hossein: nearly every shuttle launch had at least one failed O-ring; several had multiple failed O-rings. They had redundant O-rings because failure of the O-ring was so common. It would be like arguing the correlation of launches to failures. If there was never a launch, there probably would have been fewer failures (though Apollo I proved that isn't always the case). You can do a BLR between temperature and the number of failed O-rings to see that the risk was much higher when the Challenger disaster happened, but no patterns emerge if you just look at O-ring failures and disasters. – Tavrock Apr 01 '17 at 17:07
  • Thanks Tavrock. I would like to know whether this sentence is true or not: "Causation always implies correlation". According to your example, we cannot say that o-ring failure is the cause of the explosion, since there are many redundant o-rings and the failure of one o-ring has no effect on the explosion. Therefore, this example is a case of "no causation -> no correlation". If all o-rings had failed, I think it would have caused the explosion. In that case, I think, there would be a correlation between the failure of **ALL** o-rings and the explosion. So this is an example of "causation -> correlation" – Hossein Apr 01 '17 at 17:23
  • @Hossein Causation does not always imply correlation. Let $X$ be a standard normally distributed random variable. I'll then square $X$ and write down the result of $Y = X^2$. Then you have a causal relationship between $Y$ and $X$ but $Y$ and $X$ are mathematically uncorrelated. – Matthew Gunn Apr 01 '17 at 18:27
  • 1
    @MatthewGunn That's technically correct, although I think when people say "correlation does not imply causation", they use correlation very loosely and actually mean "probabilistic dependence does not imply causation". In your example, of course, X and X^2 will be uncorrelated, but still be dependent. – Julian Schuessler Apr 01 '17 at 18:35
  • @JulianSchuessler Yes. Correlation measures only a *linear* relationship between two random variables. And many somewhat imprecisely use zero correlation as synonymous with *no* relationship. – Matthew Gunn Apr 01 '17 at 18:59
  • @MatthewGunn Thanks. But why are $X$ and $X^2$ not correlated? When you know $X$, you completely know everything about $X^2$, so they are completely correlated. – Hossein Apr 01 '17 at 19:05
  • 1
    @Hossein You're misunderstanding what correlation measures. It doesn't measure dependence. It measures a linear relationship. It's a rescaled covariance. The linear relationship between $X$ and $X^2$ if $X \sim \mathcal{N}(0, 1)$ is zero. Apply a simple identity: $\operatorname{Cov}(X, X^2) = \operatorname{E}[X^3] - \operatorname{E}[X]\operatorname{E}[X^2]$. Then if $X$ is a mean-zero, normally distributed random variable, we have $\operatorname{E}[X^3] = 0$ (i.e. $X$ has zero skewness) and $\operatorname{E}[X] = 0$, hence $\operatorname{Cov}(X, X^2) = 0$ and $\operatorname{Corr}(X, X^2) = 0$ – Matthew Gunn Apr 01 '17 at 19:15
  • @Hossein Another way to think about it is that if you draw a best fit line between $Y = X^2$ and $X$ for $X \sim \mathcal{N}(0, 1)$, the best fit line is a flat line (i.e. zero slope). Some other examples: http://stats.stackexchange.com/a/22929/97925 (a short numeric check of this example also appears after these comments). – Matthew Gunn Apr 01 '17 at 19:31
  • @MatthewGunn Yes, that seems a correct counterexample for "causation implies correlation". The answer is modified. – Hossein Apr 01 '17 at 20:10
  • It's not really clear what you mean by the arrows between variables. – swmo Apr 02 '17 at 09:05
  • @Hossein As I tried to make clearer in point 6 of my answer, causation does NOT imply dependence without further assumptions. – Julian Schuessler Apr 02 '17 at 09:18
  • @swmo Arrows between variables mean (potential) causal influence, in the sense of point 2 of my answer. – Julian Schuessler Apr 02 '17 at 09:19
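
As a quick numerical check of the $X$ versus $X^2$ example from the comments above, a minimal sketch (assuming numpy): the correlation is essentially zero, while the conditional mean of $X^2$ still depends strongly on $X$, i.e. there is dependence without correlation.

```python
# Sketch: X ~ N(0, 1) and Y = X^2 are (essentially) uncorrelated,
# even though Y is a deterministic function of X.
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(1_000_000)
y = x ** 2

print("corr(X, X^2)     =", round(np.corrcoef(x, y)[0, 1], 3))  # ~ 0
print("E[X^2]           =", round(y.mean(), 2))                 # ~ 1
print("E[X^2 | |X| > 2] =", round(y[np.abs(x) > 2].mean(), 2))  # well above 1: dependence
```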
4

Both sources you link to are pretty bad.

The clearest approach to causality is the one using structural equations, potential outcomes, and causal graphs [1]. In that approach:

  1. Causal effects are assumed to exist even if one may not be able to determine them. This is just logical: One needs to define causality first before one can even think about identifying it from data.

  2. Causal effects are defined as the result of minimal external interventions to the value of a variable. E.g., the causal effect of variable A on variable B in unit $i$ is defined as $B^{a}_{i} - B^{a'}_{i}$, where $a, a'$ are two different fixed values. If for at least one unit and two different $a, a'$, these hypothetical values of B differ, A is said to have an effect on B.

  3. Direct causal effects and what you might call "total" causal effects are both causal, but potentially different. The total causal effect is what I just described. Regarding direct causal effects, these are usually defined with respect to intervening/mediating variables, and there are actually two distinct types of them. One of them is the controlled direct effect. This could be $B^{a, m}_{i} - B^{a', m}_{i}$: One compares two hypothetical outcomes for different values $a, a'$, but the same fixed value $m$ for the mediator M (mediator as in $A \rightarrow M \rightarrow B$). The natural direct effect is defined as $B^{a, {m^{a}_{i}}}_{i} - B^{a', {m^{a}_{i}}}_{i}$: A still switches from a to a', but M is fixed at its hypothetical value under the intervention $A = a$. (A minimal simulation sketch of the total and controlled direct effects follows after this list.)

  4. Obviously, researchers might know that a total causal effect exists and what its magnitude is, but they might not have a single clue about the mechanism/direct effects of a variable.
  5. There may be causal loops, e.g. A influencing B and B influencing A, although these (usually?) cannot occur simultaneously, so the "loop" is an approximation that neglects the time lag by which B influences A back, and so forth. A classic example: the mutual impact of prices and quantities of a product in an economy.
  6. Although it is unlikely and often ruled out by assumption, there may be causation without correlation (or rather: without dependence). This can be, for example, when confounding exactly cancels out the dependence induced by causation.
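
As mentioned in point 3, here is a minimal simulation sketch of the total effect (point 2) and the controlled direct effect (point 3). It assumes a linear structural-equation model $A \rightarrow M \rightarrow B$ with an additional direct arrow $A \rightarrow B$; the coefficients are arbitrary choices for illustration, not part of the definitions.

```python
# Sketch: total effect vs. controlled direct effect in a linear SEM
# A -> M -> B with an additional direct arrow A -> B.
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000

# Unit-level background noise, held fixed across hypothetical interventions
# so that B^{a}_i - B^{a'}_i compares the same unit under different values of A.
u_m = rng.normal(size=n)
u_b = rng.normal(size=n)

def m_of(a):      # structural equation for the mediator M
    return 0.8 * a + u_m

def b_of(a, m):   # structural equation for the outcome B
    return 1.5 * a + 2.0 * m + u_b

a0, a1 = 0.0, 1.0  # the two fixed values a' and a

# Total effect (point 2): intervene on A only and let M respond naturally.
total = (b_of(a1, m_of(a1)) - b_of(a0, m_of(a0))).mean()

# Controlled direct effect (point 3): intervene on A and hold M at a fixed value m.
m_fixed = 0.0
cde = (b_of(a1, m_fixed) - b_of(a0, m_fixed)).mean()

print("average total effect             =", round(total, 2))  # 1.5 + 2.0 * 0.8 = 3.1
print("average controlled direct effect =", round(cde, 2))    # 1.5
```

The two quantities differ precisely because part of A's influence on B runs through the mediator M; fixing M removes that indirect pathway.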

To expand on point 6, you may measure $P(Y|D = 1) - P(Y|D = 0)$ as the dependence between treatment D and outcome Y. By the consistency property of counterfactuals and some algebra, this is $P(Y^{1}|D = 1) - P(Y^{0}|D = 0) = $

$P(Y^{1}) - P(Y^{0}) + \left(P(Y^{1}|D = 1) - P(Y^{1})\right) + \left(P(Y^{0}) - P(Y^{0}|D = 0)\right)$

where $P(Y^{1}) - P(Y^{0})$ is the true causal effect and the remaining two terms are a bias due to confounding (they vanish when treatment is independent of the potential outcomes). The bias might exactly cancel the true effect, so that $P(Y|D = 1) - P(Y|D = 0) = 0$ even though $P(Y^{1}) - P(Y^{0}) \neq 0$. Again, this might be unlikely if the system you study is complex, but it can happen in some circumstances, regardless of how much data you have.
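
To make this cancellation concrete, a deliberately extreme simulation sketch (assuming numpy; it uses means rather than probabilities, and the unobserved confounder fully determines treatment so that the bias exactly offsets a true effect of +2):

```python
# Sketch: a real causal effect that is invisible in the raw association
# because confounding exactly offsets it.
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000

U = rng.integers(0, 2, size=n)      # unobserved confounder
D = U                               # treatment fully determined by U
Y0 = -2.0 * U + rng.normal(size=n)  # potential outcome under D = 0
Y1 = Y0 + 2.0                       # true unit-level effect is +2
Y = np.where(D == 1, Y1, Y0)        # observed outcome

true_effect = (Y1 - Y0).mean()                  # E[Y^1] - E[Y^0]
observed = Y[D == 1].mean() - Y[D == 0].mean()  # E[Y|D=1] - E[Y|D=0]
bias = (Y1[D == 1].mean() - Y1.mean()) + (Y0.mean() - Y0[D == 0].mean())

print("true causal effect:", round(true_effect, 2))  # 2.0
print("confounding bias:  ", round(bias, 2))         # -2.0
print("observed contrast: ", round(observed, 2))     # ~ 0.0
```

No amount of additional data on $D$ and $Y$ alone reveals the effect here; only knowledge of the confounder, or an actual intervention on $D$, would.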

[1] Pearl, Judea 2009. Causality. Cambridge University Press.

Julian Schuessler