In Goodfellow et al.'s Deep Learning text, it is written:

$$p(y=1 \mid x;\theta) = \sigma(\theta^\top x)$$
Is this way of defining a probability $p(y=1| x;\theta)$ even legal?
Recall the definition of a probability for a discrete random variable $X$:

$$P(X = x) = p_X(x)$$

where $p_X$ is the probability mass function.
Is the logistic function $\sigma$ considered to be a probability mass function?
If so, what plausible process would generate this probability mass function (i.e., a process analogous to the ones that define the geometric or Bernoulli pmfs)?
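
To make the question concrete, here is a minimal Python sketch of what I would check numerically. It rests on assumptions that are mine, not the book's: the values of `theta` and `x` are made up, `bernoulli_pmf` reads $\sigma(\theta^\top x)$ as the parameter of a Bernoulli distribution over $y \in \{0, 1\}$ for a fixed $x$, and `sample_y` is one candidate generating process (threshold a linear score plus logistic noise) that I am guessing might be relevant.

```python
import math
import random


def sigmoid(z):
    """Logistic function sigma(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def dot(theta, x):
    """Inner product theta^T x for two equal-length sequences."""
    return sum(t * xi for t, xi in zip(theta, x))


def bernoulli_pmf(y, x, theta):
    """Assumed reading: for fixed x, y ~ Bernoulli(sigma(theta^T x))."""
    p1 = sigmoid(dot(theta, x))
    return p1 if y == 1 else 1.0 - p1


def sample_y(x, theta, rng):
    """Hypothetical process: y = 1 if theta^T x + eps > 0, eps ~ Logistic(0, 1)."""
    u = rng.random()
    while u == 0.0:  # avoid log(0) in the inverse CDF below
        u = rng.random()
    eps = math.log(u / (1.0 - u))  # inverse CDF of the standard logistic
    return 1 if dot(theta, x) + eps > 0 else 0


# Made-up parameters and input, for illustration only.
theta = [0.5, -1.0]
x = [2.0, 0.5]

# Check 1: over y in {0, 1}, the values are non-negative and sum to 1.
total = bernoulli_pmf(0, x, theta) + bernoulli_pmf(1, x, theta)

# Check 2: the empirical frequency of y = 1 under the hypothetical process.
rng = random.Random(0)
n = 100_000
freq = sum(sample_y(x, theta, rng) for _ in range(n)) / n

print(total, bernoulli_pmf(1, x, theta), freq)
```

If this reading is right, then $\sigma$ itself is not a pmf over its argument; rather, for each fixed $x$, the pair $(1 - \sigma(\theta^\top x),\ \sigma(\theta^\top x))$ would be a pmf over $y$. I am not sure whether that is the intended interpretation.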