
I have read that, in the context of matrix factorization, performing maximum likelihood estimation under the assumption that the entries are Poisson-generated is equivalent to minimizing the generalized Kullback-Leibler divergence between the matrix and the product of its factors.

Does anyone know how to show this?

(For an example of this claim, see, e.g., page 2 of Lee and Seung's paper on NMF in Nature.)

benjaminwilson

2 Answers


I worked it out in the end; in case anyone else is interested, I wrote it up here: http://building-babylon.net/2015/02/17/maximum-likelihood-estimation-for-non-negative-matrix-factorisation-and-the-generalised-kullback-leibler-divergence/

benjaminwilson

We want to prove that \begin{align} \operatorname*{arg\,min}_{\boldsymbol{W},\boldsymbol{H}} \; D_{KL}(\boldsymbol{V} \,|\, \boldsymbol{W}\boldsymbol{H}) \quad = \quad \operatorname*{arg\,max}_{\boldsymbol{W},\boldsymbol{H}} \; p(\boldsymbol{V} \,|\, \boldsymbol{W}\boldsymbol{H}) \end{align} under the assumption that the entries of $\boldsymbol{V}$ are independently Poisson distributed with means given by the corresponding entries of $\boldsymbol{W}\boldsymbol{H}$, i.e. $v_{fn} \sim \text{Poisson}\left(\sum_k w_{fk}h_{kn}\right)$.

The KL divergence (more precisely, the generalized KL or I-divergence) is defined as \begin{align} D_{KL}(\boldsymbol{V} | \boldsymbol{W}\boldsymbol{H}) = \sum_f \sum_n \left[ v_{fn} \ln \frac{v_{fn}}{\sum_k w_{fk}h_{kn}} + \sum_k w_{fk}h_{kn} - v_{fn} \right] \end{align} The log-likelihood can be expressed in terms of this divergence: \begin{align} \ln p(\boldsymbol{V} | \boldsymbol{W}\boldsymbol{H}) &= \sum_{f,n} \ln \left[ \exp\left\lbrace -\sum_k w_{fk}h_{kn}\right\rbrace \frac{\left(\sum_k w_{fk}h_{kn}\right)^{v_{fn}}}{v_{fn}!} \right] \\ &= \sum_{f,n} \left[ v_{fn} \ln\sum_k w_{fk}h_{kn} -\sum_k w_{fk}h_{kn} - \ln v_{fn}! \right] \\ &= \sum_{f,n} \left[ v_{fn} \ln\frac{\sum_k w_{fk}h_{kn}}{v_{fn}} -\sum_k w_{fk}h_{kn} - \ln v_{fn}! + v_{fn}\ln v_{fn} \right] \\ &= -D_{KL}(\boldsymbol{V} | \boldsymbol{W}\boldsymbol{H}) + \sum_{f,n} \left[ v_{fn}\ln v_{fn} - \ln v_{fn}! - v_{fn} \right] \end{align} The remaining sum does not depend on $\boldsymbol{W}$ or $\boldsymbol{H}$, so maximizing the likelihood and minimizing the KL divergence yield the same optimum with respect to $\boldsymbol{W}, \boldsymbol{H}$.
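
As a sanity check, here is a minimal numerical sketch of the identity above, assuming NumPy and SciPy; the array names (`V`, `W`, `H`) and dimensions (`F`, `N`, `K`) are illustrative, not from the original. It verifies that $\ln p(\boldsymbol{V} | \boldsymbol{W}\boldsymbol{H}) + D_{KL}(\boldsymbol{V} | \boldsymbol{W}\boldsymbol{H})$ equals $\sum_{f,n}\left[v_{fn}\ln v_{fn} - \ln v_{fn}! - v_{fn}\right]$ for any positive $\boldsymbol{W}, \boldsymbol{H}$:

```python
# Numerical check: the Poisson log-likelihood and the generalized KL divergence
# differ only by a term that does not depend on W or H, so their optima coincide.
import numpy as np
from scipy.special import gammaln  # gammaln(v + 1) = ln(v!)

rng = np.random.default_rng(0)
F, N, K = 5, 4, 3                                   # illustrative sizes
V = rng.poisson(lam=3.0, size=(F, N)).astype(float)  # non-negative data matrix

def gen_kl(V, WH):
    # Generalized KL (I-divergence), with the convention 0 * ln 0 = 0.
    term = np.where(V > 0, V * np.log(V / WH), 0.0)
    return np.sum(term + WH - V)

def poisson_loglik(V, WH):
    # ln p(V | WH) with independent Poisson entries of mean [WH]_{fn}.
    return np.sum(V * np.log(WH) - WH - gammaln(V + 1))

# The constant gap: sum over f,n of (v ln v - ln v! - v), independent of W, H.
const = np.sum(np.where(V > 0, V * np.log(V), 0.0) - gammaln(V + 1) - V)

for _ in range(3):
    W = rng.uniform(0.1, 2.0, size=(F, K))
    H = rng.uniform(0.1, 2.0, size=(K, N))
    WH = W @ H
    gap = poisson_loglik(V, WH) + gen_kl(V, WH)
    print(np.isclose(gap, const))  # True for every random W, H
```

(`gammaln(v + 1)` is used for $\ln v_{fn}!$ so the check also works for larger counts without overflow.)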

alberto