
Repost of a Mathematics StackExchange question.

I have several related questions on this theme:

  • In MLE, we try to find the PDF parameters ($\theta$) that maximise the likelihood of the observed data, $L(\theta \mid \text{data})$. To get the likelihood of a given data point under $\theta = \theta_1$, we simply evaluate the PDF at that point (written out in the first display after this list). But we know that the probability of any single point under a continuous PDF is $0$. What is the correct reasoning behind evaluating the PDF at $x = x_1$ to get its likelihood?

  • Clearly, the sigmoid function is not a PDF. But in the MLE estimates of logistic regression, the sigmoid is used as if it were a PDF (see the second display after this list). Is my understanding correct? If not, how should I see it? If yes, what is the reason behind it?

  • This is related to the previous question. I have seen in multiple places that people read the sigmoid output as a probability. However, no constraint is imposed to ensure that all those probabilities sum to $1$ (see the third display after this list). What is the correct explanation for this?
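For concreteness, here is the likelihood evaluation the first bullet refers to (the standard i.i.d. setup, with $f$ denoting the PDF):

$$L(\theta_1 \mid x_1, \dots, x_n) = \prod_{i=1}^{n} f(x_i \mid \theta_1),$$

even though $P(X = x_i) = 0$ for a continuous random variable $X$.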
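The usage the second bullet refers to is, as far as I can tell, the Bernoulli likelihood of logistic regression, where the sigmoid sits in the place a PDF/PMF normally would (writing $\sigma(z) = 1/(1 + e^{-z})$ and assuming labels $y \in \{0, 1\}$):

$$P(y \mid x; \theta) = \sigma(\theta^\top x)^{y} \bigl(1 - \sigma(\theta^\top x)\bigr)^{1 - y}.$$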
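And the third bullet is about the fact that the per-example sigmoid outputs are not constrained to sum to one across the dataset:

$$\sum_{i=1}^{n} \sigma(\theta^\top x_i) \ne 1 \quad \text{in general}.$$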

Aroonalok
  • How is the sigmoid being used as a PDF? Where do you see people using sigmoid to infer probability? – Dave Jul 06 '21 at 18:24
  • @Dave: https://arxiv.org/pdf/1402.3722.pdf View page 3. – Aroonalok Jul 06 '21 at 18:31
  • @Dave : https://www.dropbox.com/s/qiq2c85cle9ydb6/Chapter3.pdf?dl=0 See page 7, second-to-last paragraph. – Aroonalok Jul 06 '21 at 18:39
  • These seem to be three very different questions, thematically linked together by the very broad question "Where does logistic regression come from?" I think the duplicates address this question. – Sycorax Jul 06 '21 at 19:13

0 Answers