I am trying to understand different forms of loss functions, and I get confused by the terms cross-entropy loss and negative log-likelihood loss. I have seen the following two definitions (both terms are used for both formulas). The first one is from the paper Improved Knowledge Graph Embedding using Background Taxonomic Information (Fatemi et al.), in the "Objective Function and Training" section:
- $-\sum_{n=1}^{N}\{ t_n\log(y_n) + (1 - t_n)\log (1-y_n) \}$, where $t_n$ is the label (either 0 or 1) and $y_n$ is the predicted probability
and the second one is from the paper Low-Dimensional Hyperbolic Knowledge Graph Embeddings (Chami et al.), Equation 11:
- $\sum_{n=1}^{N} \log(1+\exp(-t_ny_n))$, where $t_n$ is the label (but this time either -1 or 1) and $y_n$ is *not* a probability but a similarity score (a distance-based score in this case)
I have changed the notation so that it is the same for both formulas. Apparently, as stated in the following post, cross-entropy loss and negative log-likelihood are equivalent. But are the above two formulas the same? I don't think so. Where is the difference? Is it that one uses probabilities and the other just a similarity score?
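For what it's worth, here is a small sketch I used to compare the two formulas numerically. I'm assuming (possibly incorrectly) that the probability in the first formula is obtained by applying a sigmoid to the raw score used in the second formula; the scores and labels are just made-up toy values:

```python
import numpy as np

# Toy raw scores y_n and labels (hypothetical values, just for comparison)
scores = np.array([2.0, -1.0, 0.5])   # raw similarity scores for formula 2
labels_pm1 = np.array([1, -1, 1])     # labels in {-1, 1} for formula 2
labels_01 = (labels_pm1 + 1) // 2     # same labels mapped to {0, 1} for formula 1

# Formula 2: sum_n log(1 + exp(-t_n * y_n)), applied directly to the raw scores
loss_logistic = np.sum(np.log1p(np.exp(-labels_pm1 * scores)))

# Formula 1: binary cross-entropy, using sigmoid(score) as the probability y_n
probs = 1.0 / (1.0 + np.exp(-scores))
loss_bce = -np.sum(labels_01 * np.log(probs) + (1 - labels_01) * np.log(1 - probs))

print(loss_logistic, loss_bce)  # the two values coincide under this sigmoid assumption
```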
Thank you for your help! I really appreciate it! :D
(There are quite a few papers that use these two formulas; the two above are just examples.)