2

I have to maximize $Q(\theta;\theta')$ with respect to $\theta$ at every iteration of my EM algorithm. It boils down to solving these two equations for $\eta$ and $\gamma$ (all the $s_i$s are sufficient statistics):

$$ 0 = -\frac{1}{2}(1 + e^{\eta} - e^{-\eta} - e^{-2\eta} ) + 2e^{-\gamma}\left[s_3(1+e^{-\eta}) + (s_0 - s_1)(1-e^{-\eta})\right], $$ $$ 0 = -\frac{T}{2} + e^{-\gamma}\left\{(e^{\eta}+e^{-\eta} + 2)^{-1})\left[ 2(s_0-s_1-s_3) \right] + \frac{s_1+s_2}{2} + s_3 \right\}. $$

I try to relegate the algebra to the computer, but the following code just hangs for a super long time.

from sympy.solvers import solve
from sympy import Symbol, exp

# constants
T = Symbol('T')
s0 = Symbol('s0')
s1 = Symbol('s1')
s2 = Symbol('s2')
s3 = Symbol('s3')
s4 = Symbol('s4')

#  variables 
eta = Symbol('eta')
gamma = Symbol('gamma')

solve([-T/2+exp(-gamma)*(2*(s0-s1-s3)/(exp(eta)+exp(-eta)+2)+(s1+s2)/2 +s3),
       -(1/2)*(1 + exp(eta)-exp(-eta)-exp(-2*eta)) + 2*exp(-gamma)*(s3*(1+exp(-eta))+(s0-s1)*(1-exp(-eta)))],
        [eta,gamma])

Any advice on getting things to run or solve the system by hand? Solving by hand would do the job, but I would prefer an answer that deals with getting the code to run. I figure I'll be in a similar spot when I try to fit a different model in the future.

Taylor
  • 18,278
  • 2
  • 31
  • 66
  • 1
    It's better to use `(1 + exp(eta)-exp(-eta)-exp(-2*eta))/2` instead of `(1/2)*(1 + exp(eta)-exp(-eta)-exp(-2*eta))`. The former will create a float (or in Python 2, `0`), but SymPy works better if you use exact rationals. – asmeurer Aug 04 '16 at 19:01

1 Answers1

2

I am able to get a solution. However, it's extremely complicated, as it involves the cubic equation. I do not know if it can be simplified.

First, I'm going to set x = exp(eta) and y = exp(gamma). Then eta will equal log(x) and gamma will equal log(y):

In [42]: system = Tuple(-T/2+exp(-gamma)*(2*(s0-s1-s3)/(exp(eta)+exp(-eta)+2)+(s1+s2)/2 +s3),
       -(1 + exp(eta)-exp(-eta)-exp(-2*eta))/2 + 2*exp(-gamma)*(s3*(1+exp(-eta))+(s0-s1)*(1-exp(-eta))))

In [43]: system = system.subs({exp(eta): x, exp(gamma): y})

Now, unfortunately, the systems solver is still too slow with this, so we have to solve it by hand. Fortunately, this is easy, because the second equation is linear in y. I am using cancel to cancel common factors, and as_numer_denom()[0] to only grab the numerator (which we can do since the equations equal 0).

In [44]: a = system[0].cancel().as_numer_denom()[0]

In [45]: b = system[1].cancel().as_numer_denom()[0]

In [46]: solve(b, y)
Out[46]:
⎡4⋅x⋅(s₀⋅x - s₀ - s₁⋅x + s₁ + s₃⋅x + s₃)⎤
⎢───────────────────────────────────────⎥
⎢             3    2                    ⎥
⎣            x  + x  - x - 1            ⎦

Now we substitute that in for y in a. The result is a cubic in x

In [51]: a.subs(y, solve(b, y)[0]).cancel().as_numer_denom()[0]
Out[51]:
          2                      2                      2                    2                3         2                     3       2
- 4⋅T⋅s₀⋅x  + 4⋅T⋅s₀⋅x + 4⋅T⋅s₁⋅x  - 4⋅T⋅s₁⋅x - 4⋅T⋅s₃⋅x  - 4⋅T⋅s₃⋅x + 4⋅s₀⋅x  - 4⋅s₀⋅x + s₁⋅x  - 3⋅s₁⋅x  + 3⋅s₁⋅x - s₁ + s₂⋅x  + s₂⋅x  - s₂⋅x - s₂ + 2⋅s₃⋅x

3         2
  - 2⋅s₃⋅x  + 2⋅s₃⋅x - 2⋅s₃

In [52]: solve(a.subs(y, solve(b, y)[0]).cancel().as_numer_denom()[0], x);

I have hidden the output of In [52] because it's huge. I do not know if they can be simplified. To get y, you would substitute it in for solve(b, y)[0], and to get eta and gamma you would take logarithms.

I do not know if these symbolic solutions will end up being useful, since they are so complicated. Furthermore, the cubic equation has an annoying property that you cannot represent it without using complex numbers (even for the real root), so that may become an issue.

You can try working with this (I'm curious if it ends up working for you), but you might also consider just using a numeric solver.

asmeurer
  • 136
  • 3
  • Thanks @asmeurer. I'll check this out once I get back from vacation – Taylor Aug 05 '16 at 01:45
  • Ok got back today, and it runs after I threw in a `from sympy.abc import x, y`. Looks good. Breaking everything down definitely speeds everything up. – Taylor Aug 15 '16 at 21:35
  • I'm not worried about the cubic thing either. I'm reproducing an example of the same model with non-transformed parameters. They end up having a cubic too, and they seem to be exercising some discretion regarding which root to go with. – Taylor Aug 15 '16 at 21:36