
In logistic regression, we set the probability that the target $Y$ equals $1$ given data $X$ as

$\Pr(Y = 1|X;w) = \dfrac{\exp(w^TX)}{(1+\exp(w^TX))}$

What exactly is this probability distribution (or, more accurately, conditional probability mass function) called?

I tried looking up the logistic distribution, but it doesn't look the same: https://en.wikipedia.org/wiki/Logistic_distribution

  • The response variable y is a Bernoulli variable. You are predicting the probability of belonging to one of two particular classes. https://en.wikipedia.org/wiki/Logistic_regression – M Waz Sep 30 '19 at 22:30
  • You might find [this answer](https://stats.stackexchange.com/a/112058/7071) helpful. It gives the latent utility derivation for the binary choice model and shows where the logistic CDF comes from when modeling Bernoulli variables. – dimitriy Sep 30 '19 at 22:41

2 Answers


Since $Y_i$ is a binary variable, its distribution is the Bernoulli distribution:

$$Y_i | \mathbf{x}, \mathbf{w} \sim \text{Bern} \Bigg( \text{Prob} = \frac{\exp(\mathbf{w}^\text{T} \mathbf{x})}{1 + \exp(\mathbf{w}^\text{T} \mathbf{x})} \Bigg).$$

An alternative way of looking at logistic regression is to regard the observed response variable as a discretisation of an underlying "latent variable" that has a logistic distribution. In this (equivalent) alternative formulation, we have an observed response variable $Y_i \equiv \mathbb{I}(\tilde{Y}_i > 0)$, with the underlying latent response having the distribution:

$$\tilde{Y}_i | \mathbf{x}, \mathbf{w} \sim \text{Logistic} \Bigg( \text{Location} = \mathbf{w}^\text{T} \mathbf{x}, \ \text{Scale} = 1 \Bigg).$$
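
The equivalence of the two formulations can be checked numerically. The snippet below is only a minimal sketch using the Python standard library: the value `eta = 0.7` stands in for $\mathbf{w}^\text{T} \mathbf{x}$, and the helper names (`sigmoid`, `logistic_cdf`, `sample_logistic`, `simulate_prob_positive`) are illustrative choices, not part of the derivation above. It compares $\Pr(\tilde{Y}_i > 0)$, computed both exactly from the logistic CDF and by simulation, against the Bernoulli probability.

```python
import math
import random


def sigmoid(eta):
    """exp(eta) / (1 + exp(eta)), written in the equivalent form 1 / (1 + exp(-eta))."""
    return 1.0 / (1.0 + math.exp(-eta))


def logistic_cdf(x, loc=0.0, scale=1.0):
    """CDF of the logistic distribution with the given location and scale."""
    return 1.0 / (1.0 + math.exp(-(x - loc) / scale))


def sample_logistic(rng, loc=0.0, scale=1.0):
    """Draw one logistic variate by inverting the CDF at a uniform draw."""
    u = rng.random()
    while u == 0.0:  # avoid log(0); random() returns values in [0, 1)
        u = rng.random()
    return loc + scale * math.log(u / (1.0 - u))


def simulate_prob_positive(eta, n=100_000, seed=0):
    """Monte Carlo estimate of P(latent > 0) when latent ~ Logistic(eta, 1)."""
    rng = random.Random(seed)
    hits = sum(sample_logistic(rng, loc=eta) > 0 for _ in range(n))
    return hits / n


eta = 0.7  # assumed value of w^T x, chosen only for illustration

bernoulli_prob = sigmoid(eta)                     # probability in the Bernoulli formulation
exact_latent = 1.0 - logistic_cdf(0.0, loc=eta)   # P(latent > 0), from the CDF
simulated_latent = simulate_prob_positive(eta)    # P(latent > 0), by simulation

print(bernoulli_prob, exact_latent, simulated_latent)
```

The first two printed values should agree to floating-point precision, and the simulated value should agree up to Monte Carlo error.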

Ben
  • Thank you. Just to clarify: this is the conditional distribution of $Y$ (conditioned on $X$) and not of $Y$ itself. Is there a way that we can even talk about the distribution of $Y$ itself, given that the distribution of $X$ is usually unknown (as $X$ could simply be some vectorized image)? – Curaçao Hajek Oct 01 '19 at 05:39
  • Yes, in regression analysis the distributions are all conditional on $\mathbf{x}$. The only way to get the marginal distribution of the response variable is to specify a distribution for $\mathbf{x}$, and then apply the law of total probability. Note that this takes you out of regression analysis and into the territory of multivariate analysis. – Ben Oct 01 '19 at 05:58
  • ... And even in the latter case, $Y$ is still binary, so its marginal distribution would still be a Bernoulli distribution (albeit with a different probability parameter). – Ben Aug 27 '21 at 06:00
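
To illustrate the point made in the comments above, here is a small sketch of the law of total probability for a binary response. Everything specific in it is an assumption made purely for illustration: the weight vector `w`, the three-point discrete distribution `x_dist` for $\mathbf{x}$, and the helper name `marginal_prob`. It shows that the marginal distribution of $Y$ is still Bernoulli, with probability equal to the average of the conditional probabilities.

```python
import math


def sigmoid(eta):
    """Logistic function 1 / (1 + exp(-eta))."""
    return 1.0 / (1.0 + math.exp(-eta))


def marginal_prob(w, x_dist):
    """P(Y = 1) = sum over x of P(Y = 1 | x) * P(x), for a discrete distribution on x.

    x_dist is a list of (x_vector, probability) pairs whose probabilities sum to 1.
    """
    total = 0.0
    for x, p_x in x_dist:
        eta = sum(w_j * x_j for w_j, x_j in zip(w, x))  # w^T x
        total += sigmoid(eta) * p_x
    return total


# Hypothetical weights and a hypothetical distribution for x (first entry is an intercept term).
w = [1.0, -2.0]
x_dist = [((1.0, 0.0), 0.5), ((1.0, 1.0), 0.3), ((1.0, 2.0), 0.2)]

p_marginal = marginal_prob(w, x_dist)
print(p_marginal)  # a single number in (0, 1): the marginal Bernoulli parameter
```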

In your link, you have the cumulative distribution function for the logistic distribution as $$\frac{1}{1+e^{-\frac{x-\mu}{s}}}$$

while in your question you have $$\dfrac{\exp(w^TX)}{(1+\exp(w^TX))} \text{ which is } \dfrac{1}{1+e^{-w^TX}}$$

and these are the same when the logistic distribution has location $\mu = 0$ and scale $s = 1$ and its cumulative distribution function is evaluated at $x = w^TX$.

You can see that the first expression is a cumulative distribution function: it approaches $0$ when $x$ is very negative and approaches $1$ when $x$ is large and positive. This is what you want from your logistic regression: the predicted conditional probability of a positive result $(Y=1)$ increases towards $1$ as $w^TX$ increases.
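
As a quick numerical check of the comparison above, the sketch below evaluates both expressions on a few values of $w^TX$. The grid `zs` and the function names `prob_from_question` and `logistic_cdf` are arbitrary choices made only for this illustration.

```python
import math


def prob_from_question(z):
    """exp(z) / (1 + exp(z)), the expression from the question, with z = w^T X."""
    return math.exp(z) / (1.0 + math.exp(z))


def logistic_cdf(x, mu=0.0, s=1.0):
    """Logistic CDF 1 / (1 + exp(-(x - mu) / s)), the expression from the linked page."""
    return 1.0 / (1.0 + math.exp(-(x - mu) / s))


zs = [-10.0, -2.0, 0.0, 2.0, 10.0]  # illustrative values of w^T X
for z in zs:
    # The two columns should agree, rising from near 0 to near 1 as z increases.
    print(z, prob_from_question(z), logistic_cdf(z))
```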

Henry