How can we show that minimising cross-entropy is equivalent to maximising the log likelihood of a training set, assuming the data can be modeled by this distribution:
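A minimal numeric sketch of the equivalence being asked about, assuming (as the comments below suggest) that the distribution in question is the Bernoulli pmf $p(y) = \hat{y}^{y}(1-\hat{y})^{1-y}$; the labels `y` and predicted probabilities `p` are made-up illustration data, not taken from the post:

```python
import math

# Made-up example data: binary labels and the model's predicted P(y = 1)
y = [1, 0, 1, 1]
p = [0.9, 0.2, 0.7, 0.6]

# Log likelihood under the Bernoulli pmf p(y) = p^y * (1 - p)^(1 - y):
# each term is log(p_i) when y_i = 1 and log(1 - p_i) when y_i = 0.
log_lik = sum(math.log(pi if yi == 1 else 1 - pi) for yi, pi in zip(y, p))

# Cross-entropy between the empirical labels and the model's predictions.
cross_entropy = -sum(yi * math.log(pi) + (1 - yi) * math.log(1 - pi)
                     for yi, pi in zip(y, p))

# The two quantities agree up to sign, so minimising cross-entropy
# is the same as maximising the log likelihood.
assert math.isclose(cross_entropy, -log_lik)
```

Nothing here depends on the particular numbers: term by term, the negative log likelihood and the cross-entropy are the same expression.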
- Does this answer your question? [the relationship between maximizing the likelihood and minimizing the cross-entropy](https://stats.stackexchange.com/questions/364216/the-relationship-between-maximizing-the-likelihood-and-minimizing-the-cross-entr) – Arya McCarthy Apr 14 '21 at 00:14
- Also https://stats.stackexchange.com/questions/468818/machine-learning-negative-log-likelihood-vs-cross-entropy https://stats.stackexchange.com/questions/297749/how-meaningful-is-the-connection-between-mle-and-cross-entropy-in-deep-learning https://stats.stackexchange.com/questions/428937/mle-and-cross-entropy-for-conditional-probabilities – Arya McCarthy Apr 14 '21 at 00:15
- I have seen that post, but I am not sure whether it still applies to the distribution given for the training data above. Does it? – stgstu Apr 14 '21 at 00:22
- The top answer is precisely about the Bernoulli pmf you showed. – Arya McCarthy Apr 14 '21 at 00:24
- Please don't vandalize content on this website. If you wish to delete your question, you can do so by clicking the [delete] button below the question body. – Sycorax Apr 14 '21 at 00:29