I am reading a book and saw the following equation: $$ P(X|\theta) = \sum_{z}P(X|z,\theta)P(z|\theta)$$ I know that $$ P(X) = \sum_{z}P(X|z)P(z)$$ holds, but I don't see why the version with the extra conditioning variable is also true. Could someone show how to get from the left-hand side to the right-hand side?
2 Answers
One way to think about it, and I think the easier way, is to note that we can add any number of RVs to the conditioning side of a probability formula; for example, we also have $$P(X|\theta_1,\theta_2)=\sum_z P(X|z,\theta_1,\theta_2)P(z|\theta_1,\theta_2)$$
Alternatively, multiply both sides of the equation in question by $P(\theta)$ to get $$P(X,\theta)=\sum_z P(X|z,\theta)P(z,\theta)=\sum_z P(X,z,\theta),$$ which is just marginalization over $z$.
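The marginalization step is easy to sanity-check numerically on a small discrete joint distribution. A minimal sketch in Python (the joint table `joint` is random and purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

# A made-up joint distribution P(X, z, theta):
# axis 0 = X (2 values), axis 1 = z (3 values), axis 2 = theta (2 values).
joint = rng.random((2, 3, 2))
joint /= joint.sum()  # normalize so the table sums to 1

# Left-hand side: P(X, theta), obtained by summing out z.
p_x_theta = joint.sum(axis=1)

# Right-hand side: sum_z P(X | z, theta) P(z, theta).
p_z_theta = joint.sum(axis=0)           # P(z, theta)
p_x_given_z_theta = joint / p_z_theta   # P(X | z, theta), broadcasts over X
rhs = (p_x_given_z_theta * p_z_theta).sum(axis=1)

assert np.allclose(p_x_theta, rhs)
```

Any normalized nonnegative table works here; the identity holds for every discrete joint distribution.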

As noted here,
any rule, theorem, or formula that you have learned about probabilities is also applicable if everything is assumed to be conditioned on the occurrence of some event. For example, knowing that $$P(B^c) = 1-P(B)$$ allows us to immediately conclude that $$P(B^c\mid A) = 1 - P(B\mid A)$$ is a valid result without going through writing out the formal definitions and completing a proof of the result.
So, apply this notion to what you know, viz., $$ P(X) = \sum_{z}P(X\mid z)P(z)$$ to get $$ P(X\mid\theta) = \sum_{z}P(X\mid z,\theta)P(z\mid\theta).$$ All you are doing is conditioning everything on the new variable or event $\theta$.
Don't believe all this highfalutin nonsense? Then just grind it out the hard way using the definitions: \begin{align} \sum_{z}P(X|z,\theta)P(z|\theta) &= \sum_{z}\frac{P(X, z,\theta)}{P(z,\theta)}\times \frac{P(z,\theta)}{P(\theta)}\\ &= \sum_{z}\frac{P(X, z,\theta)}{P(\theta)}\\ &= \frac{1}{P(\theta)}\sum_{z} {P(X, z, \theta)}\\ &= \frac{P(X, \theta)}{P(\theta)}\\ &= P(X\mid\theta). \end{align}
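The same grind can be done numerically: build any small discrete joint $P(X,z,\theta)$ and check that $\sum_z P(X\mid z,\theta)P(z\mid\theta)$ equals $P(X\mid\theta)$. A sketch in Python, with a made-up random joint table for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)

# A made-up joint P(X, z, theta): axis 0 = X, axis 1 = z, axis 2 = theta.
joint = rng.random((2, 4, 3))
joint /= joint.sum()  # normalize to a valid distribution

p_theta = joint.sum(axis=(0, 1))               # P(theta)
p_x_given_theta = joint.sum(axis=1) / p_theta  # P(X | theta)

p_z_theta = joint.sum(axis=0)                  # P(z, theta)
p_z_given_theta = p_z_theta / p_theta          # P(z | theta)
p_x_given_z_theta = joint / p_z_theta          # P(X | z, theta)

# sum_z P(X | z, theta) P(z | theta)
rhs = (p_x_given_z_theta * p_z_given_theta).sum(axis=1)

assert np.allclose(p_x_given_theta, rhs)
```

Each array division mirrors one application of the definition $P(A\mid B)=P(A,B)/P(B)$ from the derivation above.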

-
You said: "any rule, theorem, or formula that you have learned about probabilities is also applicable if everything is assumed to be conditioned on the occurrence of some event". Does that not mean that if we condition $P(X) = \sum_{z}P(X\mid z)P(z)$ on $\theta$, it results in $ P(X\mid\theta) = \sum_{z}P(X\mid z\mid\theta)P(z\mid\theta)$? Although I don't know if $P(X\mid z\mid\theta)$ is even well-defined. If not, how do you get from $P(X\mid z\mid\theta)$ to $P(X\mid z,\theta)$? – Code Pope Jul 22 '19 at 08:12
-
@CodePope The probability in question is that of $X$, not of the _conditioning_ event or variable $z$, and so the conditional probability _of_ $X$, given $z$ _and also_ $\theta$, is $P(X \mid z, \theta)$, where the _comma_ on the right side of the $\mid$ is interpreted as intersection or AND. This is the general rule: everything to the right of $\mid$ is the _conditioning_ event. If you don't like this, see the more formal calculation towards the end of my answer. See also the answer by gunes, which explicitly adds a new conditioning event. – Dilip Sarwate Jul 22 '19 at 13:51