
What is the Likelihood function of a linear probability model?

I know the likelihood function is the joint probability density, but how does one construct the likelihood function when we only know the probabilities $P(Y_i=1|X_i)$ and $P(Y_i=0|X_i)$?

Rein
  • If the likelihood is the joint density of $X_i$ and $Y_i$, then you need to know $P(X_i)$ in order to get the joint density. Whether $P(X_i)$ is meaningful or not is something you need to consider. – Dilip Sarwate Dec 14 '11 at 12:43
  • What is a linear probability model? – Xi'an Dec 14 '11 at 14:21
  • The linear probability model is just linear regression with a binary outcome variable. – Charlie Dec 14 '11 at 15:48

2 Answers


In your example you simply don't have a likelihood function, because you have defined only a probability model rather than a statistical one. If you know the probabilities $$ P(Y_i = 1 | X_i) = p_i \quad \mbox{and} \quad P(Y_i =0| X_i) = 1-p_i, $$ then you have a single, fully specified conditional probability model, and there is nothing left to estimate. First you have to understand the difference between probability and statistical models. Please see this post.

In order to have a statistical model, the above probabilities must be unknown. Typically, some relation is imposed: $$ P_\theta(Y_i = 1 | X_i=x_i) = p_i(\theta),$$ where $\theta$ is the unknown parameter vector and $p_i(\theta) \in [0,1]$ for $i=1, \ldots, n$. Now, we have a parametric statistical model, since for each $\theta$ we have a probability model.

The likelihood function is $$ L(\theta) = \prod_{i=1}^n P_\theta(Y_i = y_i | X_i=x_i) = \prod_{i=1}^n p_i(\theta)^{y_i}(1-p_i(\theta))^{1-y_i}. $$
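As a minimal numerical sketch of this product formula (the probabilities $p_i$ and outcomes $y_i$ below are a hypothetical toy example, not data from the question):

```python
import numpy as np

def likelihood(p, y):
    """Evaluate L = prod_i p_i^{y_i} * (1 - p_i)^{1 - y_i}
    for given success probabilities p_i and binary outcomes y_i."""
    p = np.asarray(p, dtype=float)
    y = np.asarray(y, dtype=float)
    return np.prod(p**y * (1 - p)**(1 - y))

# three Bernoulli observations (hypothetical values)
p = [0.2, 0.7, 0.9]
y = [0, 1, 1]
print(likelihood(p, y))  # (1 - 0.2) * 0.7 * 0.9 = 0.504
```

In practice one maximizes the log of this product over $\theta$, since sums of logs are numerically better behaved than long products of small numbers.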

The shape of $p_i(\theta)$ is commonly specified as one of:

  1. $p_i(\theta) = \frac{\exp(\eta_i(\theta))}{1 + \exp(\eta_i(\theta))},$
  2. $p_i(\theta) = F(\eta_i(\theta)),$

where $\eta_i(\theta) = \alpha + \beta x_i$ and $F$ is a cumulative distribution function. You can choose a suitable form for $p_i$ by examining the data (plots, dispersion, and so on).
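The two specifications above can be sketched as follows (a hedged illustration; the function names and the choice of the normal CDF for case 2, which yields the probit model, are mine, not from the answer):

```python
import numpy as np
from scipy.stats import norm

def eta(theta, x):
    """Linear predictor eta_i = alpha + beta * x_i."""
    alpha, beta = theta
    return alpha + beta * np.asarray(x, dtype=float)

def p_logit(theta, x):
    """Case 1: logistic form exp(eta) / (1 + exp(eta))."""
    e = np.exp(eta(theta, x))
    return e / (1 + e)

def p_cdf(theta, x, F=norm.cdf):
    """Case 2: p_i = F(eta_i) for a CDF F; F = norm.cdf gives probit."""
    return F(eta(theta, x))

# both links map eta = 0 to probability 0.5
print(p_logit((0.0, 0.0), [1.0])[0])  # 0.5
print(p_cdf((0.0, 0.0), [1.0])[0])    # 0.5
```

Any CDF keeps $p_i(\theta)$ inside $[0,1]$ automatically, which is the main appeal of case (2) over a purely linear specification.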

  • +1. Note that by definition the [Linear Probability Model](https://en.wikipedia.org/wiki/Linear_probability_model) (LPM) is case (2) with $F$ the *identity* function on the interval $[0,1]$ (and usually taken to be undefined on its complement). Alternatively, rather than making it undefined, one could use the CDF for a Uniform$(0,1)$ variable, in which case (2) includes the LPM. – whuber Oct 07 '18 at 14:46

The linear probability model (LPM) is $$\mathbb{P}(Y_i=1|X_i)=\beta_0+\beta_1X_i$$ with $Y_i$ a binary (dummy) random variable and $X_i$ a random variable. If $X_i$ is instead a vector of random variables, the LPM is $$\mathbb{P}(Y_i=1|X_i)=\beta_0+X'_i\beta.$$ Let me continue with the simple case where $X_i$ is scalar. It follows that $$\mathbb{P}(Y_i=0|X_i)=1-\mathbb{P}(Y_i=1|X_i)=1-\beta_0-\beta_1X_i.$$

Thus, for $y_i\in\{0,1\}$, $$\begin{align*}\mathbb{P}(Y_i=y_i|X_i)&=\mathbb{P}(Y_i=1|X_i)^{y_i}\mathbb{P}(Y_i=0|X_i)^{1-y_i}\\&=(\beta_0+\beta_1X_i)^{y_i}(1-\beta_0-\beta_1X_i)^{1-y_i}.\end{align*}$$

Now, if $\{(Y_i, X_i)\}_{i=1}^n$ is a random sample of $n\geq 1$ observations, the likelihood function is $$\begin{align*}\mathbb{P}(Y_1=y_1,Y_2=y_2,\ldots,Y_n=y_n|X_1,X_2,\ldots,X_n)&=\prod_{i=1}^n\mathbb{P}(Y_i=y_i|X_i)\\&=\prod_{i=1}^n(\beta_0+\beta_1X_i)^{y_i}(1-\beta_0-\beta_1X_i)^{1-y_i}.\end{align*}$$

Since probabilities lie in $[0,1]$, we must assume that $\beta_0+\beta_1 X_i \in [0,1]$ for $i=1,\ldots,n$. In practice this is a source of difficulty: when fitting the LPM, fitted values may fall outside $[0,1]$ for some observed $x_i$.
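One way to respect the $[0,1]$ constraint is to maximize the log-likelihood subject to $0 \le \beta_0+\beta_1 x_i \le 1$ at every observed $x_i$. A sketch of constrained maximum likelihood with `scipy.optimize` (the data are simulated here; the seed, sample size, and true coefficients are illustrative assumptions, not from the answer):

```python
import numpy as np
from scipy.optimize import minimize, LinearConstraint

# simulate data from a true LPM (hypothetical parameters)
rng = np.random.default_rng(0)
n = 200
x = rng.uniform(0, 1, n)
p_true = 0.2 + 0.5 * x          # true probabilities, all inside [0, 1]
y = rng.binomial(1, p_true)

def neg_log_lik(theta):
    """Negative Bernoulli log-likelihood for the LPM."""
    p = theta[0] + theta[1] * x
    p = np.clip(p, 1e-9, 1 - 1e-9)  # guard against log(0)
    return -np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

# constraints: 0 <= beta0 + beta1 * x_i <= 1 for every observation
A = np.column_stack([np.ones(n), x])
res = minimize(neg_log_lik, x0=[0.5, 0.0],
               constraints=[LinearConstraint(A, 0.0, 1.0)])
print(res.x)  # estimates of (beta0, beta1)
```

This is one possible remedy for the out-of-range fitted values mentioned above; unconstrained least squares, the more common way to fit the LPM, offers no such guarantee.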

Elias
  • This likelihood function did not include the needed constraints on the parameters to require probabilities to be in [0,1]. – Frank Harrell Oct 07 '18 at 11:50