As an exercise to improve my skills in PyMC (Python's Markov chain Monte Carlo library), I am trying to implement Latent Dirichlet Allocation as described here: https://en.wikipedia.org/wiki/Latent_Dirichlet_allocation.
The model can be compactly described as
$$\boldsymbol\phi_{k=1 \dots K} \sim \operatorname{Dirichlet}_V(\boldsymbol\beta)\\ \boldsymbol\theta_{d=1 \dots M} \sim \operatorname{Dirichlet}_K(\boldsymbol\alpha)\\ z_{d=1 \dots M,w=1 \dots N_d} \sim \operatorname{Categorical}_K(\boldsymbol\theta_d) \\ w_{d=1 \dots M,w=1 \dots N_d} \sim \operatorname{Categorical}_V(\boldsymbol\phi_{z_{dw}}) $$
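For reference, the generative process above can be sketched in plain NumPy (the sizes mirror the toy example below; variable names follow the equations):

```python
import numpy as np

rng = np.random.default_rng(0)

K, V, D = 2, 4, 3            # topics, vocabulary size, documents
N = [4, 4, 4]                # words per document
alpha = np.ones(K)           # prior on per-document topic mixtures
beta = np.ones(V)            # prior on per-topic word distributions

phi = rng.dirichlet(beta, size=K)      # phi[k]: word distribution of topic k
theta = rng.dirichlet(alpha, size=D)   # theta[d]: topic mixture of document d

docs = []
for d in range(D):
    z = rng.choice(K, size=N[d], p=theta[d])                # topic per word
    w = np.array([rng.choice(V, p=phi[zi]) for zi in z])    # word given topic
    docs.append(w)
```

This is just the forward simulation; the PyMC model below tries to invert it.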
I came up with the following toy code:
import numpy as np
import pymc as pm

K = 2  # number of topics
V = 4  # number of words in the vocabulary
D = 3  # number of documents

data = np.array([[1, 1, 1, 1], [1, 1, 1, 1], [0, 0, 0, 0]])

alpha = np.ones(K)  # symmetric prior on the per-document topic mixtures
beta = np.ones(V)   # symmetric prior on the per-topic word distributions

# theta[d]: topic mixture of document d; phi[k]: word distribution of topic k
theta = pm.Container([pm.Dirichlet("theta_%s" % i, theta=alpha) for i in range(D)])
phi = pm.Container([pm.Dirichlet("phi_%s" % k, theta=beta) for k in range(K)])
Wd = [len(doc) for doc in data]  # number of words per document

# z[d][i]: topic assignment of the i-th word of document d
z = pm.Container([pm.Categorical('z_%i' % d,
                                 p=theta[d],
                                 size=Wd[d],
                                 value=np.random.randint(K, size=Wd[d]),
                                 verbose=1)
                  for d in range(D)])
# w[d]: observed words of document d, each drawn from phi[z[d][i]];
# d=d is bound as a default argument so each lambda keeps its own d
# (otherwise every lambda would see the final value of the loop variable)
w = pm.Container([pm.Categorical("w_%i" % d,
                                 p=pm.Lambda('phi_z_%i' % d,
                                             lambda z=z, phi=phi, d=d:
                                                 [phi[z[d][i]] for i in range(Wd[d])]),
                                 value=data[d],
                                 observed=True,
                                 verbose=1)
                  for d in range(D)])
model = pm.Model([theta, phi, z, w])
mcmc = pm.MCMC(model)
mcmc.sample(100, burn=10)
The tricky part is the last line of the model, $w_{d=1 \dots M,w=1 \dots N_d} \sim \operatorname{Categorical}_V(\boldsymbol\phi_{z_{dw}})$. Judging from the sampler output I must be doing something wrong: the chain does not converge, and I get many warnings from categorical_like that the probabilities do not sum to one.
Is there a PyMC expert around who can shed some light on this?
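As a sanity check outside PyMC, this is the property I believe every probability vector handed to a Categorical must satisfy (a plain NumPy illustration with made-up numbers):

```python
import numpy as np

# A length-V probability vector is a valid Categorical parameter only if
# its entries are non-negative and sum to one.
phi_full = np.array([0.1, 0.2, 0.3, 0.4])   # full simplex over V=4 words
phi_trunc = phi_full[:-1]                   # only the first V-1 entries

print(np.isclose(phi_full.sum(), 1.0))      # True
print(np.isclose(phi_trunc.sum(), 1.0))     # False: sums to 0.6
```

So the warnings suggest that whatever I am passing as `p` to the observed Categorical nodes is not a full simplex.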