
We know that $D_{KL}(q\,||\,p)$, where $p$ is a standard normal distribution (mean $0$, identity covariance matrix) and $q$ is a multivariate normal distribution with mean $\mu$ and diagonal covariance $\text{diag}(\sigma^2)$, can be calculated as

$$-\frac{1}{2}\sum_i\left(1 + \log\sigma_i^2 - \mu_i^2 - \sigma_i^2\right)$$

From this question, I've seen it's broken down like this:

$$\begin{align} \mathfrak{D}_\text{KL}[q(z|x)\mid\mid p(z)] &= \frac{1}{2}\left[\log\frac{|\Sigma_2|}{|\Sigma_1|} - n + \text{tr} \{ \Sigma_2^{-1}\Sigma_1 \} + (\mu_2 - \mu_1)^T \Sigma_2^{-1}(\mu_2 - \mu_1)\right]\\ &= \frac{1}{2}\left[\log\frac{|I|}{|\Sigma|} - n + \text{tr} \{ I^{-1}\Sigma \} + (\vec{0} - \mu)^T I^{-1}(\vec{0} - \mu)\right]\\ &= \frac{1}{2}\left[-\log{|\Sigma|} - n + \text{tr} \{ \Sigma \} + \mu^T \mu\right]\\ &= \frac{1}{2}\left[-\log\prod_i\sigma_i^2 - n + \sum_i\sigma_i^2 + \sum_i\mu^2_i\right]\\ &= \frac{1}{2}\left[-\sum_i\log\sigma_i^2 - n + \sum_i\sigma_i^2 + \sum_i\mu^2_i\right]\\ &= \frac{1}{2}\left[-\sum_i\left(\log\sigma_i^2 + 1\right) + \sum_i\sigma_i^2 + \sum_i\mu^2_i\right]\\ \end{align}$$
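As a sanity check, the last line of this derivation can be compared numerically with the compact formula $-\frac{1}{2}\sum_i(1 + \log\sigma_i^2 - \mu_i^2 - \sigma_i^2)$. This is only a minimal sketch: the function names and the test values are made up for illustration, and it assumes $q$ has a diagonal covariance parameterised by $\log\sigma_i^2$.

```python
import math


def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), compact form summed over dimensions."""
    return -0.5 * sum(
        1.0 + lv - m * m - math.exp(lv) for m, lv in zip(mu, log_var)
    )


def kl_last_line(mu, log_var):
    """Same quantity, written term by term as the final line of the derivation."""
    log_terms = -sum(lv + 1.0 for lv in log_var)    # -sum_i (log sigma_i^2 + 1)
    var_terms = sum(math.exp(lv) for lv in log_var)  # sum_i sigma_i^2
    mean_terms = sum(m * m for m in mu)              # sum_i mu_i^2
    return 0.5 * (log_terms + var_terms + mean_terms)


# Arbitrary illustrative values (assumption, not taken from any model).
mu = [0.5, -1.0, 2.0]
log_var = [0.0, -0.7, 0.3]

print(kl_to_standard_normal(mu, log_var))
print(kl_last_line(mu, log_var))
```

Both calls should print the same number up to floating-point error, and both give $0$ when $\mu = 0$ and $\sigma^2 = 1$.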

Am I right that to get what I want (the case where $\mu_2$ isn't $0$), I need to change this to:

$$\begin{align} \mathfrak{D}_\text{KL}[q(z|x)\mid\mid p(z)] &= \frac{1}{2}\left[\log\frac{|\Sigma_2|}{|\Sigma_1|} - n + \text{tr} \{ \Sigma_2^{-1}\Sigma_1 \} + (\mu_2 - \mu_1)^T \Sigma_2^{-1}(\mu_2 - \mu_1)\right]\\ &= \frac{1}{2}\left[\log\frac{|I|}{|\Sigma|} - n + \text{tr} \{ I^{-1}\Sigma \} + (\mu_2 - \mu_1)^T I^{-1}(\mu_2 - \mu_1)\right]\\ &= \frac{1}{2}\left[-\log{|\Sigma|} - n + \text{tr} \{ \Sigma \} + (\mu_2 - \mu_1)^T(\mu_2 - \mu_1)\right]\\ &= \frac{1}{2}\left[-\log\prod_i\sigma_i^2 - n + \sum_i\sigma_i^2 + \sum_i(\mu_2 - \mu_1)^2_i\right]\\ &= \frac{1}{2}\left[-\sum_i\log\sigma_i^2 - n + \sum_i\sigma_i^2 + \sum_i(\mu_2 - \mu_1)^2_i\right]\\ &= \frac{1}{2}\left[-\sum_i\left(\log\sigma_i^2 + 1\right) + \sum_i\sigma_i^2 + \sum_i(\mu_2 - \mu_1)^2_i\right]\\ \end{align}$$

How does the equation change if $\mu_2$ isn't 0?
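One way to test a candidate formula like this is to compare it with a Monte Carlo estimate of $\mathbb{E}_q[\log q(z) - \log p(z)]$. The sketch below assumes $q = \mathcal{N}(\mu_1, \text{diag}(\sigma^2))$ and $p = \mathcal{N}(\mu_2, I)$, so only the mean of $p$ differs from the standard normal case. The function names, sample count and test values are hypothetical choices for illustration.

```python
import math
import random


def kl_diag_vs_unit_cov(mu_q, var_q, mu_p):
    """Candidate closed form for KL( N(mu_q, diag(var_q)) || N(mu_p, I) )."""
    return 0.5 * sum(
        -(math.log(v) + 1.0) + v + (mp - mq) ** 2
        for mq, v, mp in zip(mu_q, var_q, mu_p)
    )


def kl_monte_carlo(mu_q, var_q, mu_p, n_samples=100_000, seed=0):
    """Estimate E_q[log q(z) - log p(z)] by sampling z ~ q, one dimension at a time."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        for mq, v, mp in zip(mu_q, var_q, mu_p):
            z = rng.gauss(mq, math.sqrt(v))
            # Log densities of the two univariate Gaussians at z.
            log_q = -0.5 * (math.log(2.0 * math.pi * v) + (z - mq) ** 2 / v)
            log_p = -0.5 * (math.log(2.0 * math.pi) + (z - mp) ** 2)
            total += log_q - log_p
    return total / n_samples


# Arbitrary illustrative values (assumption): a prior mean that is not zero.
mu_q = [0.5, -1.0]
var_q = [0.8, 1.5]
mu_p = [1.0, 2.0]

closed_form = kl_diag_vs_unit_cov(mu_q, var_q, mu_p)
estimate = kl_monte_carlo(mu_q, var_q, mu_p)
print(closed_form, estimate)
```

If the two numbers agree up to sampling noise, that supports the modified last line for a prior with identity covariance and non-zero mean. A prior with a different covariance would also change the log-determinant and trace terms.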

Gergő Horváth