The most common use of variational inference seems to be in computing the marginal distribution $P(X)$ in the denominator of Bayes' formula when computing the posterior probability of the hidden variables, $P(Z|X)$. This is likely a dumb question, but I don't understand why we need to compute $P(Z|X)$ at all. Why don't we just optimize $P(Z|X)$ instead (i.e., find the $Z$ that maximizes $P(Z|X)$)? In that case we wouldn't need to compute $P(X)$ exactly, since it does not depend on the value of $Z$. We could simply maximize the numerator $P(X|Z)P(Z)$, just as in regular MAP (maximum a posteriori) estimation of the parameters $\theta$ when there are no hidden variables $Z$. What is the difference between these two problems (estimating parameters vs. hidden variables) such that one can be handled by MAP optimization while the other needs sophisticated tools for computing (or approximating) $P(Z|X)$?
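To spell out the alternative I have in mind: since $P(X)$ is a constant with respect to $Z$,
$$
\hat{Z}_{\mathrm{MAP}} = \arg\max_{Z} P(Z|X) = \arg\max_{Z} \frac{P(X|Z)\,P(Z)}{P(X)} = \arg\max_{Z} P(X|Z)\,P(Z),
$$
so the intractable marginal $P(X)$ never has to be evaluated in this approach.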
Edit: After a little more searching, I learned that MAP is not really fully Bayesian, since it does not learn the actual posterior distribution but instead gives a point estimate of the parameters by maximizing the posterior probability. So I think what variational inference does is genuinely Bayesian. But I am still not sure I understand the need for obtaining the distribution itself. Is it to get a better sense of the parameter space?
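For concreteness, here is a toy sketch of the distinction I have in mind (my own made-up Beta-Bernoulli example, not tied to any particular variational-inference model; the posterior happens to be available in closed form by conjugacy, so no approximation is needed here). It just contrasts what a MAP point estimate gives you with what the full posterior distribution gives you:

```python
import numpy as np
from scipy import stats

# Toy conjugate example (Beta-Bernoulli), purely for illustration:
# a MAP estimate is a single number, while the posterior is a whole
# distribution that also carries uncertainty (credible intervals, expectations).
a, b = 2.0, 2.0                          # Beta prior hyperparameters (assumed)
x = np.array([1, 0, 1, 1, 0, 1, 1, 1])   # observed coin flips (assumed data)
k, n = x.sum(), len(x)

# Exact posterior P(theta | x) is Beta(a + k, b + n - k) by conjugacy
posterior = stats.beta(a + k, b + n - k)

# MAP: the single theta that maximizes the posterior density (mode of the Beta)
theta_map = (a + k - 1) / (a + b + n - 2)

# Quantities only the full posterior can provide:
theta_mean = posterior.mean()                # posterior expectation
ci_low, ci_high = posterior.interval(0.95)   # 95% credible interval

print(f"MAP point estimate : {theta_map:.3f}")
print(f"Posterior mean     : {theta_mean:.3f}")
print(f"95% credible int.  : ({ci_low:.3f}, {ci_high:.3f})")
```

Is the point of variational inference essentially to recover things like the credible interval above (i.e., the uncertainty), which a MAP point estimate cannot give?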