Consider the following "facts" about Bayes theorem and likelihood:
Bayes theorem, written generically as $P(A|B) = \frac{ P(B|A) P(A) }{ P(B) }$, involves conditional and marginal probabilities. Focus on $P(B|A)$. The Wikipedia article on Bayes theorem says this is a conditional probability (or a conditional probability density in the continuous case). This seems quite clear from the alternate expression $P(B|A) P(A) = P(A|B) P(B)$.
In Bayes theorem, $P(B|A)$ is called the likelihood.
The likelihood is $P(B|A)$ viewed as a function of $A$, not of $B$. It is not a conditional probability because it does not integrate to one. See What is the reason that a likelihood function is not a pdf?, or Bishop, Pattern Recognition and Machine Learning, p. 22: "Note that the likelihood is not a probability distribution over w, and its integral with respect to w does not (necessarily) equal one."
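To make fact 3 concrete, here is a quick numerical check with a toy example of my own (a binomial model, $k = 7$ successes in $n = 10$ trials, none of which appears in the sources above): the likelihood, viewed as a function of the parameter $p$, does not integrate to one over $p$.

```python
import numpy as np
from math import comb

# Toy binomial example (my own numbers): k = 7 successes in n = 10 trials.
# The likelihood L(p) = C(n,k) p^k (1-p)^(n-k), viewed as a function of the
# parameter p, with the data (n, k) held fixed.
n, k = 10, 7
p = np.linspace(0, 1, 10001)
likelihood = comb(n, k) * p**k * (1 - p)**(n - k)

# Numerically integrate L(p) over the parameter p with a simple Riemann sum.
area = np.sum(likelihood) * (p[1] - p[0])
print(area)  # ≈ 0.0909, i.e. 1/(n+1) — not 1
```

For each fixed $p$, the likelihood *does* sum to one over the data $k = 0, \dots, n$; it is the integral over the parameter that fails to be one, which is exactly Bishop's point.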
There is a problem here: one of these three facts must be wrong, or else I do not understand something. How can the likelihood in Bayes theorem be both a conditional probability and not a conditional probability?
I am not sure (since I do not understand!), but perhaps an answer would explain how to view the Bayes equation in terms of what is variable and what is fixed, and how probabilities (and non-probabilities, such as the likelihood) can combine in a "type consistent" way. For example, is it accurate to say that in the Bayes equation $P(A|B) = \cdots$ we should regard $B$ as fixed and $A$ as variable?
And if $P(B|A)$, the likelihood, is not a conditional probability, then the form of Bayes theorem is $$ \text{conditional probability} = \frac{ \text{other} \cdot \text{probability} }{ \text{probability} } $$ (or, for continuous variables, $$ \text{conditional probability density} = \frac{ \text{other} \cdot \text{probability density} }{ \text{probability density} } $$ ) where "other" is the type of the likelihood (not a conditional probability density). Is there a rule that $$ \text{other} \cdot \text{probability} = \text{probability} $$ ? To me this seems wrong: multiplying a probability by something (the likelihood) that can take arbitrarily large values would cause the result not to integrate to 1.
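Continuing my own toy example (a binomial likelihood with a Beta(2, 2) prior on $p$ — my choice, not from any source), here is a numerical check of what puzzles me: even though the likelihood alone integrates to far less than one, dividing the product likelihood × prior by the evidence seems to restore a proper probability density.

```python
import numpy as np
from math import comb

# Toy example (my own numbers): binomial data k = 7 of n = 10,
# with a Beta(2, 2) prior density 6 p (1 - p) on the parameter p.
n, k = 10, 7
p = np.linspace(0, 1, 10001)
dp = p[1] - p[0]

prior = 6 * p * (1 - p)                            # density: integrates to 1 over p
likelihood = comb(n, k) * p**k * (1 - p)**(n - k)  # NOT a density in p

# Evidence P(data) = integral of likelihood * prior over p (Riemann sum)
evidence = np.sum(likelihood * prior) * dp

# Bayes theorem: posterior = likelihood * prior / evidence
posterior = likelihood * prior / evidence
print(np.sum(posterior) * dp)  # ≈ 1.0 — a proper density again
```

So numerically the "other" × probability product is rescued by the division by $P(B)$, which by construction equals the integral of the numerator — but I would like to see this stated as a general rule, not just observed in one example.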
Aside: I tried to ask this question [recently], but it was closed as a duplicate. However, I think the people who decided it was a duplicate did not understand the question, which is specifically about the likelihood appearing in Bayes theorem. I hope this version is clearer! Please read the whole question before marking it as a duplicate, thank you!