
For the conditional probability $P(D_1, D_2|W)$, suppose $D_1$ and $D_2$ are dependent (I mean not i.i.d.). Then I am a bit confused about whether $$P(D_1, D_2|W) = P(D_1|W) \, P(D_2|W)$$ or $$P(D_1, D_2|W) = P(D_1|W) \, P(D_2|W, D_1).$$

From Bayes' theorem, the latter one is correct. But from logical thinking, I cannot tell what is wrong with the first formula: its logical meaning, "$D_1$ happens under the condition $W$, and at the same time $D_2$ happens under the condition $W$", seems a correct way to represent the meaning of $P(D_1, D_2|W)$, which is "$D_1$ and $D_2$ happen at the same time under the condition $W$".

What is wrong in my analysis above, which results in the wrong conclusion $$P(D_1, D_2|W) = P(D_1|W) \, P(D_2|W)$$ when $D_1$ and $D_2$ are not independent?

Lin Ma

2 Answers


It may be helpful to break the conditional probability down in terms of joint and marginal probabilities.

Starting from the definition of conditional probability, we have \begin{align} p(x,y \mid z) &= \frac{p(x,y,z)}{p(z)} \\ &= \frac{p(x,y,z)}{p(y,z)} \, \frac{p(y,z)}{p(z)} \\ &= p(x \mid y,z) \, p(y \mid z) \end{align} which shows that the correct formula is your second one. This is sometimes called the chain rule of probability. Note that it does not require any independence.
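As a sanity check, the chain rule can be verified numerically on an arbitrary joint distribution. Here is a minimal sketch (the random joint table over three binary variables is invented purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# An arbitrary joint distribution p(x, y, z) over three binary variables.
p_xyz = rng.random((2, 2, 2))
p_xyz /= p_xyz.sum()

p_z = p_xyz.sum(axis=(0, 1))   # p(z)
p_yz = p_xyz.sum(axis=0)       # p(y, z)

p_xy_given_z = p_xyz / p_z     # p(x, y | z); p_z broadcasts over the z axis
p_x_given_yz = p_xyz / p_yz    # p(x | y, z)
p_y_given_z = p_yz / p_z       # p(y | z)

# Chain rule: p(x, y | z) = p(x | y, z) p(y | z), with no independence needed.
assert np.allclose(p_xy_given_z, p_x_given_yz * p_y_given_z)
```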

(Note that this is not Bayes' theorem, which uses the chain rule, but is fundamentally about "swapping variables across the conditional", i.e. $p(z|x,y)$ would be on the right hand side.)

As Tim notes, your alternate expression only holds in the case that $x$ and $y$ are conditionally independent, given $z$. Two variables $x$ and $y$ are independent if knowing the value of one gives no information about the value of the other, i.e. $p(x|y)=p(x)$ (and so $p(y|x)=p(y)$, by Bayes' theorem).

Now it could be that knowing $y$ does give information about $x$, but if $z$ also gives that same information, then given $z$ the value of $y$ gives no new information about $x$. This conditional independence would be expressed mathematically as $$p(x|z,y)=p(x|z)$$ This gives your alternate expression, when substituted into the more general formula above.
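Here is a minimal numerical sketch of this (the distributions are invented for illustration): we build a joint as $p(x|z)\,p(y|z)\,p(z)$, so it is conditionally independent by construction, then check both characterizations.

```python
import numpy as np

rng = np.random.default_rng(1)

# Construct p(x, y, z) = p(x|z) p(y|z) p(z): conditionally independent by design.
p_z = np.array([0.3, 0.7])
p_x_given_z = rng.random((2, 2))
p_x_given_z /= p_x_given_z.sum(axis=0)   # normalize over x for each z
p_y_given_z = rng.random((2, 2))
p_y_given_z /= p_y_given_z.sum(axis=0)   # normalize over y for each z

p_xyz = np.einsum('xz,yz,z->xyz', p_x_given_z, p_y_given_z, p_z)

# Characterization 1: p(x | y, z) does not depend on y.
p_x_given_yz = p_xyz / p_xyz.sum(axis=0)
assert np.allclose(p_x_given_yz[:, 0, :], p_x_given_yz[:, 1, :])

# Characterization 2: p(x, y | z) = p(x | z) p(y | z).
p_xy_given_z = p_xyz / p_z
assert np.allclose(p_xy_given_z,
                   np.einsum('xz,yz->xyz', p_x_given_z, p_y_given_z))
```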

GeoMatt22
  • Excellent explanation GeoMatt22, I studied the "chain rule of probability" you referred to, and it seems to be a chain rule for joint probability (representing a joint probability as a product of conditional probabilities). But my question is about how to represent a conditional probability by other conditional probabilities; I am not sure the chain rule applies here? Thanks. – Lin Ma Sep 16 '16 at 06:37
  • @LinMa your question asks how to factor a conditional *joint* probability into a product of conditional probabilities. The chain rule tells you the answer, as I wrote above. The chain rule *always* applies. Conditional independence only applies *sometimes*. For more on conditional independence, you can see [this](http://math.stackexchange.com/questions/23093/could-someone-explain-conditional-independence) SE post (referenced in the Wikipedia link I gave). Note "joint" just means there is more than 1 variable left of the "|", and "conditional" means there are more than 0 variables to its right. – GeoMatt22 Sep 20 '16 at 00:51
  • Thanks GeoMatt, by the SE post did you mean the post by @joriki? – Lin Ma Sep 20 '16 at 06:57
  • @LinMa my comment had a link in it, with the text "[this](http://math.stackexchange.com/questions/23093/could-someone-explain-conditional-independence)". It is to a related question on the *Math* SE. – GeoMatt22 Sep 20 '16 at 12:52
  • Thanks GeoMatt, I like the post you referred to and voted up. But I disagree with the example of "a blue die and a red die": under the condition of knowing their sum is even, if we know the value of one die, there are only 3 (rather than 6) possible values for the other die. That is 100% correct, but how does it link to the independence of the two dice? I think knowing the sum is even does not break the assumption that the two dice are independent (by independent I mean the two dice choose their values independently)? Maybe we have different definitions of what independent means here? – Lin Ma Sep 21 '16 at 06:17
  • (cont'd) By independence I mean that even if we know their sum is even, each die still has equal probability over its 6 values; knowing the sum is even is just a post-event, and it does not magically tweak the dice (in a causal sense) to show only 3 rather than 6 values. Please feel free to correct me if I am wrong (see the simulation sketch below). – Lin Ma Sep 21 '16 at 06:20
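Regarding the dice example debated in these comments, a quick simulation may help (a minimal sketch; the sample size is arbitrary). The two dice are unconditionally independent, but once we condition on the event "the sum is even", learning the red die changes the distribution of the blue die:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 1_000_000

red = rng.integers(1, 7, n)     # fair six-sided dice, rolled independently
blue = rng.integers(1, 7, n)
even = (red + blue) % 2 == 0    # condition: the sum is even

# Unconditionally independent: P(red=1, blue=1) = 1/36 ~ 0.0278.
print(((red == 1) & (blue == 1)).mean())

# Given an even sum, the dice are dependent:
print((blue == 1)[even].mean())                 # P(blue=1 | even)        ~ 1/6
print((blue == 1)[even & (red == 1)].mean())    # P(blue=1 | even, red=1) ~ 1/3
```

Conditioning does not physically "tweak the dice"; it changes what we know, and given an even sum the red die's value narrows the blue die to 3 equally likely values instead of 6.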

Basically, you are referring to conditional independence. Imagine that we have three events $A, B, C$; we say that $A$ and $B$ are conditionally independent given $C$ if

$$ \Pr(A \cap B \mid C) = \Pr(A \mid C) \, \Pr(B \mid C) $$

so by using the first formula you are assuming conditional independence, which may or may not be true for your data. This is related to the idea of exchangeability: in Bayesian statistics we often assume that our data are independent and identically distributed conditionally on the parameters (see also the O'Neill, 2009 paper that Wikipedia refers to).
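For a concrete case of events that are marginally dependent but conditionally independent, here is a minimal simulation sketch (the fair coin and the 80% accuracy are invented for illustration): let $C$ be a coin flip and let $A$ and $B$ be two independently noisy observations of $C$. Marginally $A$ and $B$ carry information about each other (through $C$), but given $C$ they are independent:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 1_000_000

c = rng.random(n) < 0.5                    # event C: a fair coin
a = np.where(rng.random(n) < 0.8, c, ~c)   # A: noisy copy of C (flips with prob 0.2)
b = np.where(rng.random(n) < 0.8, c, ~c)   # B: an independent noisy copy of C

# Marginally, A and B are dependent: P(A, B) != P(A) P(B).
print((a & b).mean(), a.mean() * b.mean())            # ~0.34 vs ~0.25

# Conditionally on C, they are independent: P(A, B | C) = P(A | C) P(B | C).
print((a & b)[c].mean(), a[c].mean() * b[c].mean())   # both ~0.64
```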

To learn more, refer to the Wikipedia article and the math.stackexchange.com thread, which go into more detail and provide multiple worked examples.


O'Neill, B. (2009). Exchangeability, Correlation and Bayes' Effect. International Statistical Review 77(2), 241–250.

Tim
  • Thanks Tim, I think if I can write $\Pr(A,B|C) = \Pr(A|C) \, \Pr(B|C)$, it means `A` and `B` are independent, correct? So I think saying `A` and `B` are independent (under any condition) is OK; why do you say that `A` and `B` are conditionally independent? – Lin Ma Sep 16 '16 at 06:40
  • @LinMa no. It means that they are independent *conditionally* on $C$. Unconditionally they *can* be dependent. Check e.g. https://books.google.pl/books?id=qPERhCbePNcC&pg=PA61&lpg=PA61&dq=bayesian+conditional+independent+iid&source=bl&ots=L9vW6pnsQK&sig=l_KWMXbLL0PcDuHnBSXM7nsLHBI&hl=pl&sa=X&ved=0ahUKEwiA_pnW-ZDPAhXJJJoKHVPSAy4Q6AEIQDAD#v=onepage&q=bayesian%20conditional%20independent%20iid&f=false and the links I provided. – Tim Sep 16 '16 at 07:15
  • Thanks Tim for the hint, and vote up. I did some studying and cannot figure out an example where `A` and `B` are dependent but, under `C`, they are independent. It sounds a bit weird to me, since I think if `A` and `B` are dependent, they should be universally dependent under any condition. Is it possible you could show an example where `A` and `B` are dependent, but under a specific random event `C` they become independent? Thanks, and sorry for the dumb question. – Lin Ma Sep 17 '16 at 06:17
  • @LinMa have you checked the links I posted in my answer and comment? They contain multiple examples. Also, if you are not familiar with basic probability topics, then maybe you should start with a probability handbook? – Tim Sep 17 '16 at 11:30
  • Thanks Tim, when I open the link it goes to section 4.2, Exchangeability; is that what you intended to show me? I could be wrong, but I do not see the relationship between the chapter you referred to and my question. My question is that I want to find a case where A and B are dependent, but under condition C they are independent. – Lin Ma Sep 18 '16 at 00:32
  • BTW, Tim, I read some more of the book and the problem seems not quite the same; for example, example 4.2.2 is talking about conditional independence based on the parameter $\theta$. But my question is: for A and B as two random events, is there an example where A and B are dependent, but under condition C (C is also a random event) A and B are independent? I think the questions differ in whether we condition on a parameter (i.e. $\theta$) or on a random event (i.e. C). If I read your intention wrong, please feel free to correct me. – Lin Ma Sep 18 '16 at 02:12
  • @LinMa please check also the links in my answer. Together with the link in the comment, they extend and illustrate what I've written in my answer. They also give examples for what you're asking. – Tim Sep 18 '16 at 06:54
  • Thanks Tim, I took time to study it; I like the post you referred to and voted up. But I disagree with the example of "a blue die and a red die": under the condition of knowing their sum is even, if we know the value of one die, there are only 3 (rather than 6) possible values for the other die. That is 100% correct, but how does it link to the independence of the two dice? I think knowing the sum is even does not break the assumption that the two dice are independent (by independent I mean the two dice choose their values independently)? Maybe we have different definitions of what independent means here? – Lin Ma Sep 22 '16 at 06:04
  • (cont'd) By independence I mean that even if we know their sum is even, each die still has equal probability over its 6 values; knowing the sum is even is just a post-event, and it does not magically tweak the dice (in a causal sense) to show only 3 rather than 6 values. Please feel free to correct me if I am wrong. – Lin Ma Sep 22 '16 at 06:04
  • @LinMa it is hard to give a clear and intuitive example of conditional independence. You could check this one http://stats.stackexchange.com/questions/125683/pearson-correlation-has-quizzy-results/125686#125686; it is not about conditional independence, but it shows how conditioning changes relationships. I'm sorry, but it's hard for me to give better explanations than the ones given. – Tim Sep 22 '16 at 19:45
  • No problem Tim, you already helped a lot; I marked your reply as the answer. – Lin Ma Sep 23 '16 at 07:39