Consider the following "facts" about Bayes theorem and likelihood:
Bayes theorem, written generically as $P(A|B) = \frac{ P(B|A) P(A) }{ P(B) }$, involves conditional and marginal probabilities. Focus on $P(B|A)$. The Wikipedia article on Bayes theorem says this is a conditional probability (or a conditional probability density in the continuous case). This seems quite clear from the alternate expression $P(B|A) P(A) = P(A|B) P(B)$.
In Bayes theorem, $P(B|A)$ is called the likelihood.
The likelihood is $P(B|A)$ viewed as a function of $A$, not of $B$. It is not a conditional probability because it does not integrate to one. See What is the reason that a likelihood function is not a pdf?, or Bishop, Pattern Recognition and Machine Learning, p. 22: "Note that the likelihood is not a probability distribution over w, and its integral with respect to w does not (necessarily) equal one."
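To make fact 3 concrete, here is a quick numerical check with a toy example of my own (a binomial model, $k = 7$ successes in $n = 10$ trials, none of which appears in the sources above): the likelihood, viewed as a function of the parameter $p$, does not integrate to one over $p$.

```python
import numpy as np
from math import comb

# Toy binomial example (my own numbers): k = 7 successes in n = 10 trials.
# The likelihood L(p) = C(n,k) p^k (1-p)^(n-k), viewed as a function of the
# parameter p, with the data (n, k) held fixed.
n, k = 10, 7
p = np.linspace(0, 1, 10001)
likelihood = comb(n, k) * p**k * (1 - p)**(n - k)

# Numerically integrate L(p) over the parameter p with a simple Riemann sum.
area = np.sum(likelihood) * (p[1] - p[0])
print(area)  # ≈ 0.0909, i.e. 1/(n+1) — not 1
```

For each fixed $p$, the likelihood *does* sum to one over the data $k = 0, \dots, n$; it is the integral over the parameter that fails to be one, which is exactly Bishop's point.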
There is a problem here: one of these three facts must be wrong, or else I do not understand something. How can the likelihood in Bayes theorem be both a conditional probability and not a conditional probability?
I am not sure (since I do not understand!), but perhaps an answer would explain how to view the Bayes equation in terms of what is variable and what is fixed, and how probabilities (and non-probabilities, such as the likelihood) can combine in a "type consistent" way. For example, is it accurate to say that in the Bayes equation $P(A|B) = \cdots$ we should regard $B$ as fixed and $A$ as variable?
And if $P(B|A)$, the likelihood, is not a conditional probability, then the form of Bayes theorem is $$ \text{conditional probability} = \frac{ \text{other} \cdot \text{probability} }{ \text{probability} } $$ (or, for continuous variables, $$ \text{conditional probability density} = \frac{ \text{other} \cdot \text{probability density} }{ \text{probability density} } $$ ) where "other" is the type of the likelihood (not a conditional probability density). Is there a rule that $$ \text{other} \cdot \text{probability} = \text{probability} $$ ? To me this seems wrong: multiplying a probability by something (the likelihood) that can take arbitrarily large values would cause the result not to integrate to 1.
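Continuing my own toy example (a binomial likelihood with a Beta(2, 2) prior on $p$ — my choice, not from any source), here is a numerical check of what puzzles me: even though the likelihood alone integrates to far less than one, dividing the product likelihood × prior by the evidence seems to restore a proper probability density.

```python
import numpy as np
from math import comb

# Toy example (my own numbers): binomial data k = 7 of n = 10,
# with a Beta(2, 2) prior density 6 p (1 - p) on the parameter p.
n, k = 10, 7
p = np.linspace(0, 1, 10001)
dp = p[1] - p[0]

prior = 6 * p * (1 - p)                            # density: integrates to 1 over p
likelihood = comb(n, k) * p**k * (1 - p)**(n - k)  # NOT a density in p

# Evidence P(data) = integral of likelihood * prior over p (Riemann sum)
evidence = np.sum(likelihood * prior) * dp

# Bayes theorem: posterior = likelihood * prior / evidence
posterior = likelihood * prior / evidence
print(np.sum(posterior) * dp)  # ≈ 1.0 — a proper density again
```

So numerically the "other" × probability product is rescued by the division by $P(B)$, which by construction equals the integral of the numerator — but I would like to see this stated as a general rule, not just observed in one example.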
Aside: I tried to ask this question [recently], but it was closed as a duplicate. However, I think the people who decided it was a duplicate did not understand the question, which is specifically about the likelihood appearing in Bayes theorem. I hope this version is clearer! Please read the whole question before marking it as a duplicate, thank you!