One thing I cannot understand about the EM (expectation-maximization) algorithm: for an observable variable $Y$ and a latent variable $Z,$ why don't we directly take the MLE (maximum likelihood estimate) based on $Y$ alone? Since $Y$'s marginal distribution does not involve $Z,$ we don't need any observations of $Z.$
Is the marginal distribution of $Y,$ $f(y|\Theta),$ hard to calculate? At least for the examples given in the book, we know the marginal distributions explicitly.
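To be concrete, by "directly take the MLE of $Y$" I mean maximizing the observed-data log-likelihood over $\Theta$ alone,
$$\hat\Theta = \arg\max_{\Theta}\ \sum_{j=1}^{N}\log f(y_j|\Theta),$$
where $y_1,\dots,y_N$ are the observed values of $Y.$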
For example:
latent variable: $Z\sim Bernoulli(\pi),$
observable variable: $(Y|Z=1)\sim Bernoulli(p),\ (Y|Z=0)\sim Bernoulli(q),$
then we know $f(y|\pi,p,q) = \pi p^y(1-p)^{1-y} + (1-\pi)q^y(1-q)^{1-y}.$
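This marginal pmf is easy to evaluate in code. Here is a minimal sketch; the parameter values and the observations of $Y$ below are made up purely for illustration.

```python
import numpy as np

# Marginal pmf of the two-coin Bernoulli mixture:
# f(y | pi, p, q) = pi * p^y (1-p)^(1-y) + (1-pi) * q^y (1-q)^(1-y)
def marginal_pmf(y, pi, p, q):
    return pi * p**y * (1 - p)**(1 - y) + (1 - pi) * q**y * (1 - q)**(1 - y)

y = np.array([1, 1, 0, 1, 0, 0, 1])           # hypothetical observations of Y
print(marginal_pmf(y, pi=0.4, p=0.7, q=0.3))  # per-observation marginal probabilities
```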
Another example:
latent variable: $P(Z=i) = p_i,\ i=1,\dots,k,$
observable variable: $(Y|Z=i)\sim n(\mu_i,\sigma^2_i),$ with PDF denoted $n(y|\mu_i,\sigma^2_i),$
then we know $f(y|p_1,\dots,p_k,\mu_1,\dots,\mu_k,\sigma^2_1,\dots,\sigma^2_k) = \sum\limits_{i=1}^{k}p_i\cdot n(y|\mu_i,\sigma^2_i).$
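Likewise, this mixture density is straightforward to evaluate. A minimal sketch, assuming a two-component mixture with made-up weights, means, and standard deviations:

```python
import numpy as np
from scipy.stats import norm

# Marginal density of the Gaussian mixture: f(y | ...) = sum_i p_i * n(y | mu_i, sigma_i^2)
def marginal_pdf(y, weights, mus, sigmas):
    y = np.atleast_1d(y)[:, None]               # shape (N, 1)
    comps = norm.pdf(y, loc=mus, scale=sigmas)  # shape (N, k): component densities
    return comps @ weights                      # shape (N,): mixture density

weights = np.array([0.3, 0.7])    # hypothetical p_i
mus     = np.array([-1.0, 2.0])   # hypothetical mu_i
sigmas  = np.array([1.0, 0.5])    # hypothetical sigma_i
print(marginal_pdf([0.0, 1.5, 3.0], weights, mus, sigmas))
```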
In both examples the marginal distribution of $Y$ is known in closed form.
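So what I would do instead of EM is maximize the marginal log-likelihood numerically. A minimal sketch for the two-component Gaussian mixture above; the data, starting values, and the logit/log reparameterization are my own made-up choices, not taken from the book.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

# "Directly take the MLE of Y": maximize sum_j log f(y_j | Theta), no latent Z involved.
rng = np.random.default_rng(0)
y = np.concatenate([rng.normal(-1.0, 1.0, 60), rng.normal(2.0, 0.5, 140)])  # hypothetical sample

def neg_log_lik(theta):
    a, mu1, mu2, t1, t2 = theta
    w = 1.0 / (1.0 + np.exp(-a))        # mixture weight kept in (0, 1) via a logit parameter
    s1, s2 = np.exp(t1), np.exp(t2)     # standard deviations kept positive via log parameters
    f = w * norm.pdf(y, mu1, s1) + (1.0 - w) * norm.pdf(y, mu2, s2)
    return -np.sum(np.log(f))

res = minimize(neg_log_lik, x0=np.array([0.0, -0.5, 1.5, 0.0, 0.0]), method="Nelder-Mead")
print(res.x, -res.fun)  # fitted parameters (transformed scale) and maximized log-likelihood
```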