From sample $Y$, which contains the records of 10 million people, I empirically know the following conditional probabilities of death for different disease conditions:
$P(Death|Condition A) = 5\%$
$P(Death|Condition B) = 2\%$
$P(Death|Condition C) = 2\%$
From sample $X$, which contains 1 million people (it might overlap with sample $Y$ or it might not; I don't know), I also know the following:
$P(Death|Condition D) = 1\%$
$P(Death|Condition E) = 0.1\%$
$P(Death|Condition F) = 3\%$
Now, how can I calculate the probability of death given that two (or more) of those conditions are present?
- $P(Death|Condition B, Condition C)$ = ? (both from sample $Y$)
- $P(Death|Condition A, Condition E)$ = ? (across sample $X$ and sample $Y$; neither sample is necessarily a subset of the other)
- $P(Death|Condition A, Condition E, Condition C)$ = ? (generalised version of the first two cases)
Also, for the majority of conditions I have no prior knowledge of whether they are independent or dependent.
Any help is much appreciated in advance!
I did find this thread, but I still have two questions:
- What are the implications of conditions coming from two different samples with different sample sizes?
- How do you estimate $P(Death)$ and $P(Condition)$?
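
To make the second question concrete, here is a minimal sketch of the only approach I can see so far. It assumes, without any support from my data, that the conditions are conditionally independent given the outcome (a naive Bayes assumption) and that both samples share the same baseline $P(Death)$. Under those assumptions the posterior odds of death are the prior odds multiplied by one factor $odds(Death|Condition_i) / odds(Death)$ per condition. The function name `combine_naive` and the `base_rate` of 1% are hypothetical; I do not actually know $P(Death)$, which is exactly why I am asking how to estimate it.

```python
def odds(p):
    """Convert a probability in (0, 1) to odds."""
    if not 0.0 < p < 1.0:
        raise ValueError("probability must be strictly between 0 and 1")
    return p / (1.0 - p)


def combine_naive(cond_probs, base_rate):
    """Combine P(Death | Condition_i) values into P(Death | all conditions).

    ASSUMPTIONS (not verified by the data):
      * conditions are conditionally independent given Death and given
        not-Death (naive Bayes);
      * every sample shares the same baseline P(Death) = base_rate.
    """
    prior_odds = odds(base_rate)
    posterior_odds = prior_odds
    for p in cond_probs:
        # Each condition scales the odds by its own odds ratio vs. the baseline.
        posterior_odds *= odds(p) / prior_odds
    # Convert odds back to a probability.
    return posterior_odds / (1.0 + posterior_odds)


# Hypothetical baseline: P(Death) = 1% is a placeholder, not a known value.
base_rate = 0.01

# P(Death | B, C) from P(Death | B) = 2% and P(Death | C) = 2%.
print(combine_naive([0.02, 0.02], base_rate))

# P(Death | A, E, C) from 5%, 0.1% and 2%.
print(combine_naive([0.05, 0.001, 0.02], base_rate))
```

The result depends strongly on `base_rate`, and if the conditions are in fact dependent the number can be badly off, so I would treat it only as a reference point.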