I completely concur with Sycorax's comment that Adrian Raftery's 1988 Biometrika paper is the canonical reference on this topic.
- How to derive analytically the negative log-likelihood (and its
first-order conditions)?
The likelihood is the same whether $n$ is known or unknown:
$$L(n|y_1,\ldots,y_I)=\prod_{i=1}^I {n \choose y_i}p^{y_i}(1-p)^{n-y_i}
\propto \dfrac{(n!)^I(1-p)^{nI}}{\prod_{i=1}^I(n-y_i)!}$$
and the log-likelihood is the logarithm of the above
$$\ell(n|y_1,\ldots,y_I)=C+I\log n!-\sum_{i=1}^I \log (n-y_i)!+nI\log(1-p) $$
Maximum likelihood estimation of $n$ is covered in this earlier answer of mine and by Ben.
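As a complement to those answers, here is a minimal numerical sketch (not from the original references; it assumes Python with numpy/scipy, made-up counts `y`, and an arbitrary grid bound) that maximizes the profile log-likelihood over $n$, with $p$ replaced by its conditional MLE $\hat p(n)=\bar y/n$:

```python
import numpy as np
from scipy.special import gammaln

def profile_loglik(n, y):
    """Binomial log-likelihood in n, with p set to its conditional MLE ybar/n."""
    I, s = len(y), y.sum()
    p_hat = s / (I * n)
    # log C(n, y_i) computed through log-Gamma to avoid overflow
    log_binom = gammaln(n + 1) - gammaln(y + 1) - gammaln(n - y + 1)
    return log_binom.sum() + s * np.log(p_hat) + (I * n - s) * np.log1p(-p_hat)

y = np.array([16, 18, 22, 25, 27])     # hypothetical counts, for illustration only
ns = np.arange(y.max(), 1000)          # n cannot be smaller than max(y_i)
ll = np.array([profile_loglik(n, y) for n in ns])
print("profile MLE of n:", ns[ll.argmax()])
```

The profile log-likelihood is typically very flat to the right of its mode, which is precisely the instability of $\hat n$ that motivates Raftery's Bayesian treatment.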
- What is an uninformative prior for $n$ in this case (e.g., for $p$ one
can use a Uniform$(0,1)$)?
Note that the default prior on $p$ is Jeffreys' $\pi(p)\propto 1/\sqrt{p(1-p)}$ rather than the Uniform distribution. In his answer on the Bernoulli case, kjetil b halvorsen explains why a Uniform improper prior on $n$ leads to a posterior that decreases quite slowly (while remaining proper), and why another improper prior like $\pi(n)=1/n$ or $\pi(n)=1/(n+1)$ behaves more appropriately in the tails. This is connected with the fact that $n$, while being an integer, is a scale parameter in the Binomial distribution, in the sense that the random variable $Y\sim\mathcal B(n,p)$ is of order $\mathrm O(n)$. Scale parameters are usually modeled by priors like $\pi(n)=1/n$ (even though I refer you to my earlier answer as to why there is no such thing as a noninformative prior).
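For a feel of these tail differences, here is a small sketch (again with hypothetical counts, Python with scipy assumed): it integrates $p$ out in closed form under Jeffreys' Beta$(1/2,1/2)$ prior and compares, on a necessarily truncated grid, how much posterior mass falls far in the tail under a flat prior on $n$ versus $\pi(n)=1/n$:

```python
import numpy as np
from scipy.special import gammaln, betaln

def log_marginal(n, y):
    """log m(n): binomial likelihood with p integrated out under Beta(1/2, 1/2)."""
    I, s = len(y), y.sum()
    log_binom = (gammaln(n + 1) - gammaln(y + 1) - gammaln(n - y + 1)).sum()
    return log_binom + betaln(s + 0.5, I * n - s + 0.5) - betaln(0.5, 0.5)

y = np.array([16, 18, 22, 25, 27])          # same hypothetical counts as above
ns = np.arange(y.max(), 50_000)             # truncation point is arbitrary
lm = np.array([log_marginal(n, y) for n in ns])

for name, log_prior in [("flat", np.zeros_like(lm)), ("1/n ", -np.log(ns))]:
    lp = lm + log_prior
    w = np.exp(lp - lp.max())
    w /= w.sum()
    print(name, "mass on n > 1000:", round(w[ns > 1000].sum(), 3))
```

On such a grid the flat prior leaves a sizeable share of the mass very far out, while $\pi(n)=1/n$ pulls the posterior back toward moderate values of $n$, in line with kjetil b halvorsen's observation.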
- Is there a conjugate prior for $n$?
Since the collection of $\mathcal B(n,p)$ distributions is not an exponential family when $n$ varies (its support depends on $n$), there is no conjugate prior family.
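To make the support issue explicit, write the pmf with its indicator:
$$f(y\mid n,p)={n \choose y}p^{y}(1-p)^{n-y}\,\mathbf 1_{\{0,1,\ldots,n\}}(y)$$
An exponential-family representation $h(y)\exp\{\eta(n,p)\,T(y)-A(n,p)\}$ would require a base measure $h(y)$ free of the parameters, which the indicator of $y\le n$ forbids.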
- What if the prior on $n$ is improper, i.e. a discrete prior on
$\{y_{\max},y_{\max}+1,\ldots\}\subset\mathbb N$? Is there a proper solution?
It depends on the improper prior. The answer by kjetil b halvorsen in the Bernoulli case shows that there exist improper priors leading to well-defined posterior distributions. There also exist improper priors leading to ill-defined posteriors for every sample size $I$: for instance, $\pi(n)\propto\exp\{\exp(n)\}$ should lead to a posterior with infinite mass.
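A quick numerical check of that last claim (same assumed setup as the sketches above): the log of the unnormalized posterior term, $\log\pi(n)+\log m(n)$, keeps drifting downward under $\pi(n)=1/n$ but explodes under $\pi(n)\propto\exp\{\exp(n)\}$, suggesting an infinite total mass:

```python
import numpy as np
from scipy.special import gammaln, betaln

y = np.array([16, 18, 22, 25, 27])      # hypothetical counts, as before
I, s = len(y), y.sum()

def log_marginal(n):
    """log m(n), with p integrated out under Jeffreys' Beta(1/2, 1/2) prior."""
    log_binom = (gammaln(n + 1) - gammaln(y + 1) - gammaln(n - y + 1)).sum()
    return log_binom + betaln(s + 0.5, I * n - s + 0.5) - betaln(0.5, 0.5)

for n in [100, 200, 400, 600]:
    print(n,
          log_marginal(n) - np.log(n),          # pi(n) = 1/n: term keeps decreasing
          log_marginal(n) + np.exp(float(n)))   # pi(n) = e^{e^n}: term blows up
```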