I'm trying to understand how to reestimate parameters, as part of the EM algorithm. As a simple example, I'm trying to derive the reestimation formula for an exponential distribution. Here's the setup:
Suppose we have an observation sequence of positive real numbers $\{x_i: i=1,2,\ldots,n\}$. Each observation $x_i$ could have come from any one of a set of states. Let $s_i$ be the state of the $i^{\text{th}}$ observation. Assume we know (or have already estimated) the probability of each observation being in each of the states.
Now assume that in state 1, $x_i$ has an exponential distribution with density $(1/t)e^{-x_i/t}$, where $t$ is an unknown parameter. The goal is to find the reestimation formula for $t$.
I think the quantity we have to maximize is the following:
$\prod_i P(s_i=1)(1/t)e^{-x_i/t}$
$=(1/t^n)e^{-\sum_i x_i/t}\prod_i P(s_i=1)$
I then take the derivative with respect to $t$ and set it equal to 0:
$[(-n/t^{n+1})e^{-\sum_i x_i/t}+(1/t^n)d/dt(-\sum_i x_i/t)e^{-\sum_i x_i/t}]\prod_i P(s_i=1)=0$
$[(-n/t^{n+1})e^{-\sum_i x_i/t}+(1/t^n)(\sum_i x_i/t^2)e^{-\sum_i x_i/t}]\prod_i P(s_i=1)=0$
$[-nt+\sum_i x_i][e^{-\sum_i x_i/t}/t^{n+2}]\prod_i P(s_i=1)=0$
$-nt+\sum_i x_i=0$
$t=\sum_i x_i/n$
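As a sanity check on the algebra only (not on whether this is the right quantity to maximize), here is a minimal Python sketch. It grid-searches the log of the product above and compares the maximizer with $\sum_i x_i/n$. The data are synthetic, and the names `log_likelihood`, `t_closed` and `t_grid` are just ones I made up for illustration.

```python
import math
import random

def log_likelihood(t, xs):
    """Log of (1/t^n) * exp(-sum(xs)/t).

    The factor prod_i P(s_i=1) does not depend on t, so it is dropped.
    """
    n = len(xs)
    return -n * math.log(t) - sum(xs) / t

random.seed(0)
# Synthetic observations (an assumption for this check): exponential with mean 2.0
xs = [random.expovariate(1 / 2.0) for _ in range(200)]

# Closed-form result from the derivation above
t_closed = sum(xs) / len(xs)

# Coarse grid search over t in (0, 10] with step 0.01
grid = [0.01 * k for k in range(1, 1001)]
t_grid = max(grid, key=lambda t: log_likelihood(t, xs))

print("closed form:", t_closed)
print("grid search:", t_grid)
```

The two values agree to within the grid spacing, so the differentiation itself seems fine.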
But it looks like the official answer is $t=\sum_i P(s_i=1)x_i/\sum_i P(s_i=1)$
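To check that the two formulas really give different numbers, here is a minimal Python sketch with made-up membership probabilities. It also grid-searches a weighted log-likelihood, $\sum_i P(s_i=1)\log[(1/t)e^{-x_i/t}]$, which is only my guess at the objective the official formula comes from. The data, the weights `ws`, and all function names are hypothetical.

```python
import math
import random

def weighted_log_likelihood(t, xs, ws):
    """Sum over i of w_i * log((1/t) * exp(-x_i / t)), with w_i standing in for P(s_i=1)."""
    return sum(w * (-math.log(t) - x / t) for x, w in zip(xs, ws))

random.seed(1)
# Synthetic observations and made-up probabilities P(s_i=1) (assumptions for this check)
xs = [random.expovariate(1 / 2.0) for _ in range(200)]
ws = [random.random() for _ in xs]

# My result: plain sample mean
t_plain = sum(xs) / len(xs)

# Official answer: probability-weighted mean
t_weighted = sum(w * x for x, w in zip(xs, ws)) / sum(ws)

# Coarse grid search of the guessed weighted objective, step 0.01
grid = [0.01 * k for k in range(1, 1001)]
t_grid = max(grid, key=lambda t: weighted_log_likelihood(t, xs, ws))

print("plain mean:   ", t_plain)
print("weighted mean:", t_weighted)
print("grid search:  ", t_grid)
```

On this data the grid search lands on the weighted mean rather than the plain mean, which makes me suspect the product I wrote down is not the quantity EM maximizes.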
What went wrong?
Thanks