
The expected value of the parameters of the multinomial distribution, taking into account the Dirichlet prior $\mathcal{D}(\alpha)$ and the resulting Dirichlet posterior, is:

$\pi_i = \frac{\alpha_i + x_i}{\sum_{j} (\alpha_j + x_j)}$

What are the other statistics and formulas of the multinomial distribution?

Mosab Shaheen
    In both cases you use priors. MAP = mode of posterior distribution. – Tim Feb 21 '19 at 18:58
  • MAP is by definition mode of posterior distribution. "There will be difference" from what? – Tim Feb 21 '19 at 19:27
  • https://en.wikipedia.org/wiki/Mode_(statistics) – Tim Feb 21 '19 at 19:36
  • If I may suggest something, given your two questions, you'd probably gain more if you learned more/repeated the basic probability theory & statistics before studying more detailed material. Going directly to the complicated stuff may be very hard without understanding the basics. – Tim Feb 21 '19 at 19:40
  • You cannot compare the distribution with its mode. It's like asking what the difference is between the age distribution in humans and the average height. – Tim Feb 21 '19 at 19:43
  • @Tim thanks, actually I did not form the question properly, sorry for that. I meant that $\pi_i = \frac{\alpha_i + x_i}{\sum_j (\alpha_j + x_j)}$, which is the formula of the estimated parameters of the multinomial distribution, is this formula estimated using MAP or using another estimation method? – Mosab Shaheen Feb 21 '19 at 19:52
  • This is *not* formula for estimating parameters, but for estimating mean of the posterior distribution. The distribution is estimated by applying Bayes theorem. MAP means estimating just the mode of the distribution. – Tim Feb 21 '19 at 20:24
  • @Tim thanks again. I think that the "mean of the posterior distribution" is the proper value for the parameters of the multinomial distribution, that is why I said "formula of the estimated parameters" please correct me if I am wrong? – Mosab Shaheen Feb 21 '19 at 20:29
  • No, mean of the distribution and parameters of the distribution are two different things (but in some cases coincide). Sorry, but comments are not meant for introducing basic concepts of probability & statistics. You should really start with introductory handbook, otherwise it doesn't make sense. – Tim Feb 21 '19 at 20:30
  • That's absolutely right, but what I am saying is that the mean of the posterior dirichlet (in the case of multinomial distribution with unknown parameters) is used as the proper value of the parameters of the multinomial (because you once told me in my previous question that the parameters of the multinomial are unknown), is it right? – Mosab Shaheen Feb 21 '19 at 20:35
  • No, because there is no "proper" value. The parameters are random variables, they don't have single "value" and the posterior predictive distribution is Dirichlet-multinomial. If you want point estimate, you can take mean, median, mode etc. of the distribution, depending on what you need. – Tim Feb 21 '19 at 20:40
  • By the way I changed my question to be more accurate and to reflect what I mean. – Mosab Shaheen Feb 21 '19 at 20:50

1 Answer


As said in a previous answer to your question, in the Dirichlet-multinomial model we assume a Dirichlet prior for the parameters $\pi_1, \pi_2, \dots, \pi_k$ of the multinomial distribution, which leads to the following model

$$\begin{align} (x_1, x_2, \dots, x_k) &\sim \mathcal{M}(n, \, \pi_1, \pi_2, \dots, \pi_k) \\ (\pi_1, \pi_2, \dots, \pi_k) &\sim \mathcal{D}(\alpha_1, \alpha_2, \dots, \alpha_k) \end{align}$$

We estimate the parameters by applying Bayes' theorem, and because the Dirichlet is a conjugate prior for the multinomial, we have a closed-form expression for the posterior distribution of $\pi_1, \pi_2, \dots, \pi_k$: it is again Dirichlet, with posterior parameters $\alpha_1+x_1, \alpha_2+x_2, \dots, \alpha_k+x_k$. If you want point estimates for the parameters $\pi_1, \pi_2, \dots, \pi_k$, you can take the mean of the distribution

$$ E(\pi_i) = \frac{\alpha_i + x_i}{\sum_{j=1}^k \alpha_j + x_j} $$

but you could just as well look at other statistics of the distribution, such as the mode

$$ \mathrm{Mode}(\pi_i) = \frac{\alpha_i + x_i - 1}{\sum_{j=1}^k (\alpha_j + x_j -1)} $$

Notice that the mode is defined only when all posterior parameters $\alpha_i + x_i > 1$, since otherwise the mass of the distribution does not accumulate around a single peak, as you can see in the examples in this answer. The mode of the posterior distribution is also known as the maximum a posteriori (MAP) estimate.
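To make the conjugate update and the two point estimates concrete, here is a minimal sketch in Python with NumPy, using made-up prior parameters and counts (any positive $\alpha_i$ and non-negative counts would do):

```python
import numpy as np

# Hypothetical prior and data
alpha = np.array([2.0, 2.0, 2.0])   # Dirichlet prior parameters
x = np.array([10, 3, 7])            # observed multinomial counts

# Conjugacy: the posterior is Dirichlet with parameters alpha + x
a_post = alpha + x

# Posterior mean: E(pi_i) = (alpha_i + x_i) / sum_j (alpha_j + x_j)
mean = a_post / a_post.sum()

# Posterior mode (MAP), defined when all posterior parameters exceed 1
mode = (a_post - 1) / (a_post.sum() - len(a_post))

print(mean)  # shrinks the raw proportions x / n toward the prior
print(mode)
```

Both estimates sum to one; with these numbers the mean is roughly $(0.46, 0.19, 0.35)$ and the mode $(0.48, 0.17, 0.35)$.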

As a side note: the above formulas are very simple, but in many cases (not here) estimating the full posterior distribution is a hard problem. Finding a point estimate via MAP is often much easier, since the MAP can be found by optimization, rather than by MCMC simulation of the full distribution. As a consequence, people sometimes directly estimate the mode (a point estimate) without finding the full distribution.
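As an illustration of that remark, the MAP here can also be recovered by direct numerical optimization; the sketch below (assuming SciPy is available, and reusing made-up posterior parameters) maximizes the Dirichlet log-density under a softmax reparameterization, which keeps the iterate on the probability simplex, and matches the closed-form mode:

```python
import numpy as np
from scipy.optimize import minimize

a_post = np.array([12.0, 5.0, 9.0])  # hypothetical posterior Dirichlet parameters

def neg_log_posterior(z):
    # Softmax reparameterization: pi lies on the simplex for any real z
    pi = np.exp(z - z.max())
    pi /= pi.sum()
    # Dirichlet log-density up to a constant: sum_i (a_i - 1) log pi_i
    return -np.sum((a_post - 1) * np.log(pi))

res = minimize(neg_log_posterior, x0=np.zeros(3), method="Nelder-Mead")
pi_map = np.exp(res.x - res.x.max())
pi_map /= pi_map.sum()

closed_form = (a_post - 1) / (a_post.sum() - len(a_post))
print(pi_map)       # numerical MAP
print(closed_form)  # analytic mode, for comparison
```

Of course, in this conjugate case the optimization is unnecessary; the point is that the same approach works when no closed form exists.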

You may also be interested in other statistics of the posterior distribution (median, quantiles, etc.), depending on your needs.
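For statistics without convenient closed forms, such as the median or credible intervals, you can draw samples from the posterior directly; a sketch with NumPy's Dirichlet sampler and the same made-up posterior parameters:

```python
import numpy as np

a_post = np.array([12.0, 5.0, 9.0])   # hypothetical posterior Dirichlet parameters
rng = np.random.default_rng(0)

# Monte Carlo samples from the posterior; each row is a point on the simplex
samples = rng.dirichlet(a_post, size=100_000)

median = np.median(samples, axis=0)                    # per-component posterior median
lo, hi = np.quantile(samples, [0.025, 0.975], axis=0)  # 95% credible intervals

print(median)
print(lo, hi)
```

The sample mean of the draws will agree with the analytic mean $(\alpha_i + x_i)/\sum_j(\alpha_j + x_j)$ up to Monte Carlo error.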

Tim
  • @MosabShaheen As I said, you should **really** start by learning basic concepts like estimator or mode before going to advanced topics. Otherwise this won't work. It is as if you went to university to study literature, but didn't know how to read and write. There are infinitely many estimators. MAP is also a point estimator. The mode is the highest point of the probability density function, but this goes beyond the question you asked. Also please notice that comments are **not** meant for asking follow-up questions (ask new ones if you need) or lengthy discussions. – Tim Feb 22 '19 at 10:30
  • @MosabShaheen You asked "is expected value estimated using MAP?" So I did answer your question: expected value is not estimated by MAP, since MAP is mode of the posterior Dirichlet distribution. If you have follow up questions, like "how the formula for mode of Dirichlet distribution was derived", then ask a new question. At this point I will stop this discussion, since comments are really not meant for lengthy discussions. – Tim Feb 22 '19 at 10:55