
In the machine learning paper *Overcoming catastrophic forgetting in neural networks*, the authors present equation 1, the log of Bayes' rule:

$$ \log p(\theta|D) = \log p(D|\theta) + \log p(\theta) - \log p(D) \quad (1)$$

where $\theta$ are the model parameters and $D$ is the dataset being fitted. OK, I get that. Then they say that "the data is split into two independent parts, one defining task A ($D_a$) and the other task B ($D_b$). Then, we can re-arrange equation 1:"

$$ \log p(\theta|D) = \log p(D_b|\theta) + \log p(\theta|D_a) - \log p(D_b) \quad (2) $$

Can someone guide me step by step how they go from (1) to (2)? Many thanks in advance!


1 Answer


Assuming that the two datasets are independent both marginally and conditionally on $\theta$, you have the following relationships: $$p(D)=p(D_a)p(D_b), \qquad p(D|\theta)=p(D_a|\theta)p(D_b|\theta)$$

Using these, we can write the original Bayes rule as $$p(\theta|D)=\frac{p(D_a|\theta)p(D_b|\theta)p(\theta)}{p(D_a)p(D_b)}=\frac{p(\theta|D_a)p(D_a)p(D_b|\theta)}{p(D_a)p(D_b)}=\frac{p(\theta|D_a)p(D_b|\theta)}{p(D_b)}$$
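
The second equality there is worth spelling out: it is Bayes' rule applied to task A alone,

$$p(D_a|\theta)\,p(\theta) = p(\theta|D_a)\,p(D_a),$$

which is what lets the prior $p(\theta)$ and the task-A likelihood be folded into the task-A posterior $p(\theta|D_a)$.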

Taking the logarithm of both sides, you obtain (2).
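
Written out, that last step is just additivity of the logarithm:

$$\log p(\theta|D) = \log\frac{p(\theta|D_a)\,p(D_b|\theta)}{p(D_b)} = \log p(D_b|\theta) + \log p(\theta|D_a) - \log p(D_b),$$

which is exactly equation (2).

If a numerical check helps, here is a minimal sketch (a hypothetical Bernoulli toy model, not from the paper): it puts $\theta$ on a small discrete grid, splits some coin flips into $D_a$ and $D_b$, plugs the factorisations $p(D|\theta)=p(D_a|\theta)p(D_b|\theta)$ and $p(D)=p(D_a)p(D_b)$ into equation (1), and confirms that the result matches the right-hand side of equation (2) for every grid value of $\theta$.

```python
import numpy as np

# Hypothetical toy model: theta is a coin bias restricted to a small grid,
# so every probability in equations (1) and (2) can be computed exactly.
thetas = np.array([0.2, 0.5, 0.8])     # discrete grid of parameter values
prior = np.full(3, 1.0 / 3.0)          # p(theta), uniform over the grid

D_a = np.array([1, 0, 1])              # task A data (1 = heads)
D_b = np.array([1, 1, 0, 1])           # task B data

def likelihood(data, theta):
    """p(data | theta) for i.i.d. Bernoulli(theta) flips."""
    return np.prod(np.where(data == 1, theta, 1.0 - theta))

lik_a = np.array([likelihood(D_a, t) for t in thetas])   # p(D_a | theta)
lik_b = np.array([likelihood(D_b, t) for t in thetas])   # p(D_b | theta)

p_Da = np.sum(lik_a * prior)           # p(D_a), prior-weighted marginal
p_Db = np.sum(lik_b * prior)           # p(D_b), prior-weighted marginal
post_a = lik_a * prior / p_Da          # p(theta | D_a), Bayes' rule on task A only

# Left-hand side: equation (1) with the independence factorisations plugged in,
#   log p(theta|D) = log p(D_a|theta) + log p(D_b|theta) + log p(theta)
#                    - log p(D_a) - log p(D_b)
lhs = np.log(lik_a) + np.log(lik_b) + np.log(prior) - np.log(p_Da) - np.log(p_Db)

# Right-hand side: equation (2),
#   log p(theta|D) = log p(D_b|theta) + log p(theta|D_a) - log p(D_b)
rhs = np.log(lik_b) + np.log(post_a) - np.log(p_Db)

print(np.allclose(lhs, rhs))           # True: (1) and (2) agree on the grid
```

Both sides agree for every value of $\theta$ on the grid, which is just the algebra above carried out numerically.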
