
I'm trying to derive the MLE and Bayesian posterior for $n$ in the Binomial model, $\mathrm{Binomial}(n, p)$ with known $y$ and $p$. The following questions arise

  1. How to derive analytically the negative log-likelihood (and its first-order conditions)?

  2. What is an uninformative prior for $n$ in this case (e.g., for $p$ one can use a $\mathrm{Uniform}(0, 1)$)?

  3. Is there a conjugate prior for $n$?

  4. What if the prior on $n$ is improper, i.e., a discrete prior on $\{y_{\max}, y_{\max}+1, \ldots\} \subset \mathbb{N}$? Does it still lead to a proper posterior?

I tried to look around for references, but I was only able to find rather technical papers that do not directly address this (simpler) case.

References found so far:

  • [1] paper1

  • [2] paper2

  • [3] paper3: the most informative so far

  • [4] paper4: addresses the case I'm interested in but does not show a direct log-likelihood maximization

Pietro
  • Well, you've already found the Raftery paper, which is probably the canonical Bayesian resource on this topic. Can you help me understand why that paper is not sufficient? – Sycorax May 25 '21 at 14:52
  • Well... in short, I would need a 1.5x dumber version to be able to get it :) Also, the MLE case for known $p$ is not directly discussed there. Trying to parse a new ref that I added to the post (i.e., [4]) – Pietro May 25 '21 at 15:47
  • 3
    Can you find your answer at https://stats.stackexchange.com/questions/405808/maximum-likelihood-estimator-of-n-when-x-sim-mathrmbinn-p ? – kjetil b halvorsen May 25 '21 at 15:59
  • Thanks @kjetilbhalvorsen, this answers question 1 completely and gives an additional derivation without using the likelihood ratio. I might still be interested in questions 2-4 or pointers to (easier to parse) refs – Pietro May 25 '21 at 16:58
  • 3
    Here https://stats.stackexchange.com/questions/502124/numbers-of-draws-on-a-modified-bernouilli-process/521043#521043 is a post that can help with some ideas for priors ... – kjetil b halvorsen May 25 '21 at 18:39
  • 2
    Xi'an, Sycorax, kjetilbhalvorsen - really - thank you very much for your help! That's the first time I've asked a question here and I've been sincerely amazed. Thanks for sharing your knowledge! – Pietro May 26 '21 at 09:12

1 Answer


I completely concur with Sycorax's comment that Adrian Raftery's 1988 Biometrika paper is the canonical reference on this topic.

  1. How to derive analytically the negative log-likelihood (and its first-order conditions)?

The likelihood is the same whether or not $n$ is unknown: $$L(n|y_1,\ldots,y_I)=\prod_{i=1}^I {n \choose y_i}p^{y_i}(1-p)^{n-y_i} \propto \dfrac{(n!)^I(1-p)^{nI}}{\prod_{i=1}^I(n-y_i)!}$$ and the log-likelihood is the logarithm of the above $$\ell(n|y_1,\ldots,y_I)=C+I\log n!-\sum_{i=1}^I \log (n-y_i)!+nI\log(1-p) $$ Maximum likelihood estimation of $n$ is covered in this earlier answer of mine and by Ben.
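As a concrete check, here is a minimal numerical sketch (the data and $p$ are made up): it evaluates $\ell(n)$ via `math.lgamma` and scans upward from $\max_i y_i$, which works because the likelihood ratio $L(n)/L(n-1)$ is decreasing in $n$, so $\ell$ is unimodal.

```python
import math

# Hypothetical data: I = 5 binomial counts with known p = 0.3.
y = [11, 8, 10, 9, 12]
p = 0.3
I = len(y)

def log_lik(n):
    """Log-likelihood l(n | y_1, ..., y_I), dropping the constant C."""
    return (I * math.lgamma(n + 1)
            - sum(math.lgamma(n - yi + 1) for yi in y)
            + n * I * math.log(1 - p))

# The log-likelihood is unimodal in n, so scanning upward from
# max(y) and stopping at the first decrease finds the MLE.
n_hat = max(y)
while log_lik(n_hat + 1) > log_lik(n_hat):
    n_hat += 1
print(n_hat)  # 33 for this data
```

Note that the scan starts at $\max_i y_i$ since the likelihood is zero for any smaller $n$.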

  2. What is an uninformative prior for $n$ in this case (e.g., for $p$ one can use a Uniform$(0,1)$)?

Note that the default prior on $p$ is Jeffreys' $\pi(p)\propto 1/\sqrt{p(1-p)}$ rather than the Uniform distribution. In his answer on the Bernoulli case, kjetil b halvorsen explains why a Uniform improper prior on $n$ leads to a posterior that decreases quite slowly (while remaining proper), and why another improper prior like $\pi(n)=1/n$ or $\pi(n)=1/(n+1)$ has a more appropriate behaviour in the tails. This is connected to the fact that $n$, while being an integer, is a scale parameter in the Bernoulli distribution, in the sense that the random variable $Y\sim\mathcal B(n,p)$ is of order $\mathrm O(n)$. Scale parameters are usually modeled by priors like $\pi(n)=1/n$ (even though I refer you to my earlier answer as to why there is no such thing as a noninformative prior).
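Since $n$ is a single integer parameter, the posterior under such a prior can simply be tabulated on a grid. A sketch with made-up data (the truncation point `N` is an assumption, justified here by the geometric decay of $(1-p)^{nI}$ in the likelihood):

```python
import math

# Hypothetical data: known p, observed counts y; prior pi(n) = 1/n.
y = [11, 8, 10, 9, 12]
p = 0.3
I = len(y)

def log_lik(n):
    """Log-likelihood l(n | y), up to an additive constant."""
    return (I * math.lgamma(n + 1)
            - sum(math.lgamma(n - yi + 1) for yi in y)
            + n * I * math.log(1 - p))

# Unnormalized log-posterior on {max(y), ..., N-1}; (1-p)^{nI}
# decays geometrically, so truncating at a large N loses
# negligible mass.
N = 2000
logs = [log_lik(n) - math.log(n) for n in range(max(y), N)]
m = max(logs)
w = [math.exp(l - m) for l in logs]   # stabilized weights
Z = sum(w)
post = [wi / Z for wi in w]

mode = max(y) + post.index(max(post))
mean = sum((max(y) + k) * pk for k, pk in enumerate(post))
print(mode, round(mean, 1))
```

The same loop with `- math.log(n)` deleted gives the posterior under the (improper) Uniform prior, which is still proper here because the geometric term tames the tail.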

  3. Is there a conjugate prior for $n$?

Since the collection of $\mathcal B(n,p)$ distributions is not an exponential family when $n$ varies (its support depends on $n$), there is no conjugate prior family.

  4. What if the prior on $n$ is improper, i.e., a discrete prior on $\{y_{\max}, y_{\max}+1, \ldots\}\subset\mathbb N$? Is there a proper solution?

It depends on the improper prior. The answer by kjetil b halvorsen in the Bernoulli case shows that there exist improper priors leading to well-defined posterior distributions. There also exist improper priors leading to ill-defined posterior distributions for all sample sizes $I$: for instance, $\pi(n)\propto\exp\{\exp(n)\}$ should lead to an infinite-mass posterior.

Xi'an
  • 3
    Xi'an thanks a lot for sharing your knowledge! I sincerely appreciate your thorough post which answers all my questions. Thank you! – Pietro May 26 '21 at 09:13
  • Xi'an I'd like to ask you to expand on the following sentence "_$n$, while being an integer, is a scale parameter in the Bernoulli_ [maybe binomial here?] _distribution_". In other words, how can I see that $n$ is a scale parameter? Is there a way to write it out explicitly? Any pointer to resources would be much appreciated. Thanks a lot in advance for your time – Pietro May 28 '21 at 10:46
  • 1
    Besides Harold Jeffreys (1939) himself using $\pi(n)\propto 1/n$, the argument is rather low-tech: a Binomial rv $Y$ is the number of times something happens out of $n$ trials. If $n$ is turned into $10n$, $Y$ is on average ten times larger.... – Xi'an May 28 '21 at 12:50
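The low-tech scaling argument in the last comment can be checked with a quick simulation (the specific $n$ and $p$ below are illustrative): multiplying $n$ by 10 multiplies the typical size of $Y\sim\mathcal B(n,p)$ by 10.

```python
import random

random.seed(0)
p = 0.3

def sim_mean(n, trials=5000):
    """Monte Carlo estimate of E[Y] for Y ~ Binomial(n, p)."""
    return sum(sum(random.random() < p for _ in range(n))
               for _ in range(trials)) / trials

m_small = sim_mean(30)    # E[Y] = n * p = 9
m_large = sim_mean(300)   # E[Y] = 90, ten times larger
print(m_small, m_large)
```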