
What is the Likelihood function of a linear probability model?

I know the likelihood function is the joint probability density, but how does one construct the likelihood function when we only know the probabilities $P(Y_i=1|X_i)$ and $P(Y_i=0|X_i)$?

Rein
  • If the likelihood is the joint density of $X_i$ and $Y_i$, then you need to know $P(X_i)$ in order to get the joint density. Whether $P(X_i)$ is meaningful or not is something you need to consider. – Dilip Sarwate Dec 14 '11 at 12:43
  • What is a linear probability model? – Xi'an Dec 14 '11 at 14:21
  • The linear probability model is just linear regression with a binary outcome variable. – Charlie Dec 14 '11 at 15:48

2 Answers


In your example you simply don't have a likelihood function, because you have defined only a probability model rather than a statistical one. If you know the probabilities $$ P(Y_i = 1 | X_i) = p_i \quad \mbox{and} \quad P(Y_i =0| X_i) = 1-p_i, $$ then you have a single, fully specified conditional probability model, and there is nothing left to estimate. First you have to understand the difference between probability and statistical models. Please see this post.

In order to have a statistical model, the above probabilities must be unknown. Typically, some relation is imposed: $$ P_\theta(Y_i = 1 | X_i=x_i) = p_i(\theta),$$ where $\theta$ is the unknown parameter vector and $p_i(\theta) \in [0,1]$ for $i=1, \ldots, n$. Now, we have a parametric statistical model, since for each $\theta$ we have a probability model.

The likelihood function is $$ L(\theta) = \prod_{i=1}^n P_\theta(Y_i = y_i | X_i=x_i) = \prod_{i=1}^n p_i(\theta)^{y_i}(1-p_i(\theta))^{1-y_i}. $$
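As a minimal numerical sketch of this product formula (the probabilities $p_i$ and outcomes $y_i$ below are a hypothetical toy example, not data from the question):

```python
import numpy as np

def likelihood(p, y):
    """Evaluate L = prod_i p_i^{y_i} * (1 - p_i)^{1 - y_i}
    for given success probabilities p_i and binary outcomes y_i."""
    p = np.asarray(p, dtype=float)
    y = np.asarray(y, dtype=float)
    return np.prod(p**y * (1 - p)**(1 - y))

# three Bernoulli observations (hypothetical values)
p = [0.2, 0.7, 0.9]
y = [0, 1, 1]
print(likelihood(p, y))  # (1 - 0.2) * 0.7 * 0.9 = 0.504
```

In practice one maximizes the log of this product over $\theta$, since sums of logs are numerically better behaved than long products of small numbers.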

The shape of $p_i(\theta)$ is commonly specified as one of:

  1. $p_i(\theta) = \frac{\exp(\eta_i(\theta))}{1 + \exp(\eta_i(\theta))},$
  2. $p_i(\theta) = F(\eta_i(\theta)),$

where $\eta_i(\theta) = \alpha + \beta x_i$ and $F$ is a cumulative distribution function. You can choose a suitable form for $p_i$ by examining the data (plots, dispersion, and so on).
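The two specifications above can be sketched as follows (a hedged illustration; the function names and the choice of the normal CDF for case 2, which yields the probit model, are mine, not from the answer):

```python
import numpy as np
from scipy.stats import norm

def eta(theta, x):
    """Linear predictor eta_i = alpha + beta * x_i."""
    alpha, beta = theta
    return alpha + beta * np.asarray(x, dtype=float)

def p_logit(theta, x):
    """Case 1: logistic form exp(eta) / (1 + exp(eta))."""
    e = np.exp(eta(theta, x))
    return e / (1 + e)

def p_cdf(theta, x, F=norm.cdf):
    """Case 2: p_i = F(eta_i) for a CDF F; F = norm.cdf gives probit."""
    return F(eta(theta, x))

# both links map eta = 0 to probability 0.5
print(p_logit((0.0, 0.0), [1.0])[0])  # 0.5
print(p_cdf((0.0, 0.0), [1.0])[0])    # 0.5
```

Any CDF keeps $p_i(\theta)$ inside $[0,1]$ automatically, which is the main appeal of case (2) over a purely linear specification.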

  • +1. Note that by definition the [Linear Probability Model](https://en.wikipedia.org/wiki/Linear_probability_model) (LPM) is case (2) with $F$ the *identity* function on the interval $[0,1]$ (and usually taken to be undefined on its complement). Alternatively, rather than making it undefined, one could use the CDF for a Uniform$(0,1)$ variable, in which case (2) includes the LPM. – whuber Oct 07 '18 at 14:46

The linear probability model (LPM) is $$\mathbb{P}(Y_i=1|X_i)=\beta_0+\beta_1X_i$$ with $Y_i$ a binary (dummy) random variable and $X_i$ a random variable. If $X_i$ is instead a vector of random variables, the LPM is $$\mathbb{P}(Y_i=1|X_i)=\beta_0+X'_i\beta.$$ Let me continue with the simple case where $X_i$ is scalar. It follows that $$\mathbb{P}(Y_i=0|X_i)=1-\mathbb{P}(Y_i=1|X_i)=1-\beta_0-\beta_1X_i.$$

Thus, for $y_i\in\{0,1\}$, $$\begin{align*}\mathbb{P}(Y_i=y_i|X_i)&=\mathbb{P}(Y_i=1|X_i)^{y_i}\mathbb{P}(Y_i=0|X_i)^{1-y_i}\\&=(\beta_0+\beta_1X_i)^{y_i}(1-\beta_0-\beta_1X_i)^{1-y_i}.\end{align*}$$

Now, if $\{(Y_i, X_i)\}_{i=1}^n$ is a random sample of $n\geq 1$ observations, the likelihood function is $$\begin{align*}\mathbb{P}(Y_1=y_1,Y_2=y_2,\ldots,Y_n=y_n|X_1,X_2,\ldots,X_n)&=\prod_{i=1}^n\mathbb{P}(Y_i=y_i|X_i)\\&=\prod_{i=1}^n(\beta_0+\beta_1X_i)^{y_i}(1-\beta_0-\beta_1X_i)^{1-y_i}.\end{align*}$$

Since probabilities lie in $[0,1]$, we must assume that $\beta_0+\beta_1 X_i \in [0,1]$ for $i=1,\ldots,n$. In practice this is a source of difficulty: when fitting the LPM, fitted values may fall outside $[0,1]$ for some observed $x_i$.
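One way to respect the $[0,1]$ constraint is to maximize the log-likelihood subject to $0 \le \beta_0+\beta_1 x_i \le 1$ at every observed $x_i$. A sketch of constrained maximum likelihood with `scipy.optimize` (the data are simulated here; the seed, sample size, and true coefficients are illustrative assumptions, not from the answer):

```python
import numpy as np
from scipy.optimize import minimize, LinearConstraint

# simulate data from a true LPM (hypothetical parameters)
rng = np.random.default_rng(0)
n = 200
x = rng.uniform(0, 1, n)
p_true = 0.2 + 0.5 * x          # true probabilities, all inside [0, 1]
y = rng.binomial(1, p_true)

def neg_log_lik(theta):
    """Negative Bernoulli log-likelihood for the LPM."""
    p = theta[0] + theta[1] * x
    p = np.clip(p, 1e-9, 1 - 1e-9)  # guard against log(0)
    return -np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

# constraints: 0 <= beta0 + beta1 * x_i <= 1 for every observation
A = np.column_stack([np.ones(n), x])
res = minimize(neg_log_lik, x0=[0.5, 0.0],
               constraints=[LinearConstraint(A, 0.0, 1.0)])
print(res.x)  # estimates of (beta0, beta1)
```

This is one possible remedy for the out-of-range fitted values mentioned above; unconstrained least squares, the more common way to fit the LPM, offers no such guarantee.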

Elias
  • This likelihood function did not include the needed constraints on the parameters to require probabilities to be in [0,1]. – Frank Harrell Oct 07 '18 at 11:50