I'm having a hard time checking an equality in https://arxiv.org/pdf/1511.01707.pdf. It is unnumbered, immediately before (24), on page 13. Any help would be appreciated. Here is some simplified notation and a simplified question.
Let
- $v^{(i)} > 0$ be the unnormalized weights
- $v_{\text{max}} = \max_i\{\log v^{(i)}\}$
- $\tilde{v}^{(i)} = \log v^{(i)} - v_{\text{max}}$ the shifted log-weights
- $i=1,\ldots,N$
Usually we want to calculate $\frac{1}{N}\sum_i v^{(i)}$. But they're saying it's more numerically stable to calculate $$ \log \left[ \frac{1}{N}\sum_i v^{(i)} \right] = v_{\text{max}} + \left[\sum_i \tilde{v}^{(i)} \right]- \log N. $$ The problem is that I'm not getting that.
\begin{align*}
\log \left[ \frac{1}{N}\sum_i v^{(i)} \right]
&= - \log N + \log\left[\sum_i v^{(i)} \right] \\
&= - \log N + \log\left[\sum_i \exp \log v^{(i)} \right] \\
&= - \log N + \log\left[\sum_i \exp \left\{\log v^{(i)} + v_{\text{max}} - v_{\text{max}} \right\}\right] \\
&= - \log N + \log\left[\sum_i \exp \left\{ \tilde{v}^{(i)} + v_{\text{max}} \right\}\right] \\
&= - \log N + \log\left[\exp\left( v_{\text{max}}\right) \sum_i \exp \tilde{v}^{(i)} \right] \\
&= - \log N + v_{\text{max}} + \log\left[ \sum_i \exp \tilde{v}^{(i)} \right] \\
&\neq - \log N + v_{\text{max}} + \log\left[ \exp \sum_i \tilde{v}^{(i)} \right] \tag{?}\\
&= v_{\text{max}} + \left[\sum_i \tilde{v}^{(i)} \right]- \log N.
\end{align*}
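To make the discrepancy concrete, here is a small numerical check (with made-up weights, just for illustration) comparing the direct computation, the log-sum-exp form I derive above, and the formula as written in the paper:

```python
import math

# Hypothetical unnormalized weights, just for illustration.
v = [1e-3, 2.5, 7.0, 0.04]
N = len(v)

log_v = [math.log(vi) for vi in v]
v_max = max(log_v)
v_tilde = [lv - v_max for lv in log_v]  # shifted log-weights

# Direct computation of the log of the mean weight.
direct = math.log(sum(v) / N)

# Log-sum-exp form from the derivation above.
lse = v_max + math.log(sum(math.exp(t) for t in v_tilde)) - math.log(N)

# The formula as transcribed: the sum of shifted log-weights,
# with no exp/log pair around it.
as_written = v_max + sum(v_tilde) - math.log(N)

print(direct, lse, as_written)
```

The first two agree to machine precision; the third is wildly different, which is what makes me think I'm misreading something.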
- What am I missing here?
- Why is it more numerically stable? Why are floating-point numbers better approximations to real numbers when the real numbers are not extremely small?
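For the second question, here is a sketch of the failure mode I assume the paper is guarding against: if the log-weights are very negative (hypothetical values below), exponentiating them directly underflows to zero in double precision, while shifting by the max first keeps everything in a representable range.

```python
import math

# Hypothetical log-weights so negative that exp underflows in float64.
log_v = [-1000.0, -1001.0, -1002.0]
N = len(log_v)

# Naive route: exponentiate first, then take the log of the mean.
naive = [math.exp(lv) for lv in log_v]   # every term underflows to 0.0
# math.log(sum(naive) / N) would now raise a math domain error.

# Stable route: shift by the max before exponentiating.
v_max = max(log_v)
v_tilde = [lv - v_max for lv in log_v]   # 0.0, -1.0, -2.0: safe to exponentiate
stable = v_max + math.log(sum(math.exp(t) for t in v_tilde)) - math.log(N)
print(stable)  # a finite value near -1000
```

So the shift doesn't make floats "better approximations" in general; it just moves the arithmetic away from the underflow (or overflow) region before any exponentiation happens.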