
I am deriving logistic regression's likelihood. I have seen two different versions:

$$\begin{equation} f(y|\beta)={\displaystyle \prod_{i=1}^{N} \frac{n_i!} {y_i!(n_i-y_i)!}}\, \pi_{i}^{y_i}(1-\pi_i)^{n_i - y_i} \tag 1 \end{equation}$$

Or this:

$$\begin{equation} L(\beta_0,\beta_1)= \displaystyle \prod_{i=1}^{N}p(x_i)^{y_i}(1-p(x_i))^{1-y_i} \tag 2 \end{equation}$$

Why is there $\frac{n_i!} {y_i!(n_i-y_i)!}$ in equation 1?

Sources:

  1. First: https://czep.net/stat/mlelr.pdf (page 3 equ. 2)
  2. Second: http://www.stat.cmu.edu/~cshalizi/uADA/12/lectures/ch12.pdf (page 5 equ. 12.6)

Note: This question is not a duplicate of What does "likelihood is only defined up to a multiplicative constant of proportionality" mean in practice? One can trace the answer back to the binomial distribution after seeing how it is done, but no one would have known that the question in that post answers this one.

gung - Reinstate Monica
user13985
  • That factor should be there, but if you are looking for the $\beta$ that maximises this function, then since the factor does not depend on $\beta$, it will have no influence on the $\beta$ at which the maximum occurs. By the way, you lost the $\Pi$ in the second formula. –  Sep 08 '17 at 16:07
  • Even after seeing the note (and, digging deeper, seeing the close and reopen), I too would have said "likelihood functions are defined up to proportionality" was the answer to this question. Here, it does not matter whether you know the order of the observations or not, as they lead to proportional likelihood functions – Henry Sep 09 '17 at 01:24

1 Answer


The second is a special case of the first. Your first reference discusses the case where each $y_i$ is distributed as a Binomial random variable with sample size $n_i$, while the second reference assumes each $y_i$ is a Bernoulli random variable. That is the difference: when each $n_i = 1$ and $y_i \in \{0, 1\}$, the binomial coefficient $\frac{n_i!} {y_i!(n_i-y_i)!} = 1$, and equation 1 reduces to equation 2.
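A quick numerical sketch of this (not from either reference; the value of $\pi_i$ is arbitrary): with $n_i = 1$, the binomial coefficient equals 1 for both possible outcomes, so the binomial pmf coincides with the Bernoulli pmf $\pi^{y}(1-\pi)^{1-y}$.

```python
from math import comb

# With n_i = 1, the binomial coefficient n_i! / (y_i! (n_i - y_i)!) is 1
# for y_i in {0, 1}, so the binomial pmf reduces to the Bernoulli pmf.
pi = 0.3  # arbitrary success probability, for illustration only
for y in (0, 1):
    binom_pmf = comb(1, y) * pi**y * (1 - pi)**(1 - y)
    bern_pmf = pi**y * (1 - pi)**(1 - y)
    assert comb(1, y) == 1
    assert binom_pmf == bern_pmf
```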

Some quotes supporting this: from 2.1.2 in the first reference:

Since the probability of success for any one of the $n_i$ trials is $\pi_i$...

And from the first section in the second reference 12.1:

Let's pick one of the classes and call it "$1$" and the other "$0$"...

Taylor