If ϵ is uniformly distributed, is a linear probability model appropriate? Is there any literature on this?

Question

A latent variable model involving a binomial observed variable $Y$ can be constructed such that $Y$ is related to the latent variable $Y^*$ via

$ Y = \begin{cases} 0, & \mbox{if }Y^*>0 \\ 1, & \mbox{if }Y^*<0. \end{cases} $

The latent variable $Y^*$ is then related to a set of regression variables $X$ by the model $Y^* = X\beta + \varepsilon$. This results in a binomial regression model.

The variance of $\varepsilon$ cannot be identified and, when it is not of interest, is often assumed to be equal to one. If $\varepsilon$ is normally distributed, then a probit is the appropriate model and if $\varepsilon$ is log-Weibull distributed, then a logit is appropriate. If $\varepsilon$ is uniformly distributed, then a linear probability model is appropriate.

If the $\varepsilon$'s have a log-Weibull distribution a complementary log-log link is appropriate; if they have a logistic distribution a logit link is appropriate. — Scortchi - Reinstate Monica, Jan 10 '14 at 11:46

score 6 · Answer 1 · edited Jan 10 '14 at 11:15

Let's try to validate the claim that if the error term of the underlying latent variable model is assumed uniformly distributed, then a Linear Probability model is appropriate.

The underlying latent variable model is (assuming a single regressor for simplicity; the argument is unchanged with more regressors)

$$Y^* = b_0+ b_1X + \epsilon,\;\; \epsilon\mid X\sim U(-a,a)$$

where the limits for $U$ are chosen so that the error term has a zero expected value, conditional on the regressors. For $-a \le \epsilon \le a$, the cumulative distribution function is $F_{\epsilon|X}(\epsilon\mid X) = \frac {\epsilon + a}{2a}$

and the observed model is (given how the question defines $Y$ as a function of $Y^*$)

$$P(Y =1\mid X ) = P(Y^*<0\mid X) = P(b_0+ b_1X + \epsilon<0\mid X) = P(\epsilon <- b_0- b_1X\mid X)$$ $$=F_{\epsilon|X}(- b_0- b_1X\mid X) = \frac {- b_0- b_1X + a}{2a} = \frac {- b_0+a}{2a}+\frac {- b_1}{2a}X$$

$$\Rightarrow P(Y =1\mid X )= \beta_0 + \beta_1X$$

which is the Linear Probability model with the mapping

$$\beta_0 = \frac {- b_0+a}{2a},\;\; \beta_1=\frac{- b_1}{2a}$$
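As a quick numerical check of this mapping, here is a minimal simulation sketch. The parameter values and the helper names `lpm_coefficients` and `simulate_prob_y1` are illustrative assumptions, not part of the question. The values are chosen so that $-b_0 - b_1X$ stays inside $[-a, a]$ for every $X$ used, which is where the uniform CDF is linear; outside that range the probability is simply 0 or 1.

```python
import random


def lpm_coefficients(b0, b1, a):
    """Map latent-model parameters (b0, b1, a) to the LPM coefficients."""
    beta0 = (-b0 + a) / (2 * a)
    beta1 = -b1 / (2 * a)
    return beta0, beta1


def simulate_prob_y1(b0, b1, a, x, n=100_000, seed=0):
    """Monte Carlo estimate of P(Y = 1 | X = x), where Y = 1 when Y* < 0."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n):
        eps = rng.uniform(-a, a)       # error term, U(-a, a)
        y_star = b0 + b1 * x + eps     # latent variable
        if y_star < 0:                 # the question's convention for Y = 1
            hits += 1
    return hits / n


# Illustrative values (assumptions): |b0 + b1 * x| <= a for every x below.
b0, b1, a = 0.5, -1.0, 3.0
beta0, beta1 = lpm_coefficients(b0, b1, a)

for x in (-1.0, 0.0, 1.0, 2.0):
    simulated = simulate_prob_y1(b0, b1, a, x)
    linear = beta0 + beta1 * x
    print(f"x={x:+.1f}  simulated={simulated:.4f}  linear={linear:.4f}")
```

Under these assumptions the simulated frequencies should agree with $\beta_0 + \beta_1 X$ up to Monte Carlo noise.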
