
I know the equation of the sigmoid function and use it in logistic regression, SVM, etc.

$$ S(x) = \frac{1}{1 + e^{-x}} $$

In the case of the sigmoid function, what exactly are the input and output? What I know is that it takes the value of $x$ written in the above equation, where $x$ is the prediction, and gives an output that is a probability value: the probability that a point belongs to a specific class.

Could anyone give me a better explanation of the sigmoid function?

F.C. Akhi

2 Answers


Your description is correct. The proper name of the function is the logistic function, as "sigmoid" is ambiguous and may be applied to different S-shaped functions. It takes as input some value $x$ on the real line, $x \in (-\infty, \infty)$, and transforms it to a value in the unit interval, $S(x) \in (0, 1)$. It is commonly used to transform the outputs of models (logistic regression, neural networks) to probabilities, because probabilities are also bounded to the unit interval. It is not the only function that does this; there are many others, like the probit or cloglog, but the logistic function is the most popular one.
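
For a concrete numerical illustration, here is a minimal sketch (plain NumPy; the helper name `sigmoid` is just for this example) showing how inputs from anywhere on the real line are squashed into $(0, 1)$:

```python
import numpy as np

def sigmoid(x):
    """Logistic function: maps any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

# Inputs can come from anywhere on the real line...
x = np.array([-10.0, -1.0, 0.0, 1.0, 10.0])

# ...but the outputs always land strictly between 0 and 1.
print(sigmoid(x))
# [4.53978687e-05 2.68941421e-01 5.00000000e-01 7.31058579e-01 9.99954602e-01]
```

The midpoint $S(0) = 0.5$ and the symmetry $S(-x) = 1 - S(x)$ are visible in these values.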

Tim
  • Sigmoid means S-shaped (from the Greek letter sigma, equivalent to s in many other languages) -- with the warning or understanding here that the S is stretched into a one-to-one function that is monotone, bounded in result, and steepest at argument $0$. Regarding the logistic function as **the** sigmoid function is from some points of view a stretch or even an abuse of terminology, but it's so common in various areas that complaint is futile. – Nick Cox Jan 26 '22 at 09:21
  • @NickCox good point about naming, updated for that. – Tim Jan 26 '22 at 09:26
  • @Tim Is the input $x$ you mention the $x$ from $y = mx + c$, or the value of $y$? Is it transforming the value of the prediction, i.e. $y$, into some probability value? If so, how does it represent the probability of a point belonging to a specific class? It seems like it is doing two jobs: 1. giving the prediction output as a probability value, and 2. deciding which class point $x$ falls into. Am I right? – F.C. Akhi Jan 26 '22 at 09:33
  • @F.C.Akhi $y = S(x)$; for logistic regression, $y = S(mx + c)$, so you transform something that was predicted by the model into a probability for $y$. It represents a probability in the sense that it is bounded in $(0, 1)$, as probabilities are, nothing more. It doesn't do two things, [it just returns probabilities](https://stats.stackexchange.com/questions/127042/why-isnt-logistic-regression-called-logistic-classification/127044#127044); it is up to you how to use the probabilities for making classifications (see the short sketch after these comments). – Tim Jan 26 '22 at 10:14
  • @Tim how did you edit my equation? Is it a single $ sign? – F.C. Akhi Jan 26 '22 at 11:22
  • @F.C.Akhi Click "edit" under your question to see the raw markdown. `$` is used to enclose an inline formula, `$$` to enclose a block one. – Tim Jan 26 '22 at 11:38
  • @F.C.Akhi FYI, this site uses MathJax (which is very similar to $\LaTeX$), just like the Mathematics site (and some other SE sites) does. This is what the `$` and `$$` that Tim mentions are for. Two Math meta posts you may find helpful for more info are [Short and helpful advice on using MathJax on the site ...](https://math.meta.stackexchange.com/q/33179/602049) and [MathJax basic tutorial and quick reference](https://math.meta.stackexchange.com/q/5020/602049). – John Omielan Jan 26 '22 at 20:22
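
To make the distinction discussed in the comments concrete, here is a small sketch (plain NumPy; the parameter values `m`, `c` and the 0.5 threshold are illustrative assumptions, not anything prescribed above) of how a linear prediction is turned into a probability, and how classification is a separate, optional step:

```python
import numpy as np

def sigmoid(z):
    """Logistic function: squashes a real-valued score into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical fitted parameters of a logistic regression model.
m, c = 2.0, -1.0

# Feature values for a few points.
x = np.array([-2.0, 0.0, 0.5, 3.0])

# Step 1: the model's raw prediction is the linear score mx + c (a real number).
score = m * x + c

# Step 2: the logistic function turns each score into a probability of class 1.
prob = sigmoid(score)

# Step 3 (a separate choice): classify by thresholding the probability, e.g. at 0.5.
label = (prob >= 0.5).astype(int)

print(prob)   # [0.00669285 0.26894142 0.5        0.99330715]
print(label)  # [0 0 1 1]
```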

If you have a probability $p$ with $0<p<1$ then the odds are $\frac{p}{1-p}$ and the log-odds are $\log\left(\frac{p}{1-p}\right)$.

Your $S(x)$ is the inverse function to this, so $S\left(\log\left(\frac{p}{1-p}\right)\right)=p$ and $\log\left(\frac{S(x)}{1-S(x)}\right)=x$.

It turns log-odds into probabilities.
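
For example, writing out the composition explicitly, using only the definition of $S$ above, shows why it inverts the log-odds:

$$ S\!\left(\log\frac{p}{1-p}\right) = \frac{1}{1 + e^{-\log\frac{p}{1-p}}} = \frac{1}{1 + \frac{1-p}{p}} = \frac{p}{p + (1-p)} = p. $$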

Henry