You need to consider what the actual data is and what the parameters are, not just some abstract $A$ and $B$.
The process is tied to the sampling procedure, and because of that we already have a likelihood for the data.
If you ever wonder about the difference between probability and likelihood, think of likelihood as something attached to evidence we already have: the data happened, and we ask how likely that data is.
We always speak of the likelihood of the data; I am not aware that we speak of the likelihood of the parameters, since we usually still need to find the parameters somehow, say with the expectation-maximization algorithm.
So:
- $\mathbb P(\theta \mid \mathsf {data})$ is the posterior
- $\mathbb P(\mathsf {data} \mid \theta)$ is the likelihood
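As a quick illustration of "likelihood of the data" (a minimal sketch with made-up coin-flip numbers): the data stays fixed, and the likelihood is a number you can evaluate at any candidate value of $\theta$.

```python
# A minimal sketch of "likelihood of the data": the data is fixed,
# theta is the knob we can turn. The numbers here are made up.
from scipy.stats import binom

heads, flips = 7, 10           # observed data: 7 heads in 10 coin flips (hypothetical)

for theta in (0.3, 0.5, 0.7):  # candidate values of the parameter
    lik = binom.pmf(heads, flips, theta)  # P(data | theta)
    print(f"P(7 heads in 10 flips | theta={theta}) = {lik:.4f}")
```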
This is a chicken-and-egg problem, and if you have ever thought about it that way, you already know Bayesian statistics. The prior $\mathbb P(\theta)$ acts as a regularizer used to create the new posterior from the likelihood, but then the posterior can become the new prior if you iterate the procedure.
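A minimal sketch of that iteration, assuming a Beta prior on the probability of heads (the conjugate choice, so the update is closed-form) and invented batches of data:

```python
# A minimal sketch of "the posterior becomes the new prior", using the
# Beta-Bernoulli conjugate pair so each update is just two additions.
# The data batches below are invented for illustration.

a, b = 1.0, 1.0                      # Beta(1, 1) prior: flat over theta

batches = [(3, 2), (6, 4), (1, 9)]   # (heads, tails) observed in each batch

for heads, tails in batches:
    a, b = a + heads, b + tails      # posterior Beta(a, b) after this batch...
    print(f"posterior after batch: Beta({a:.0f}, {b:.0f}), "
          f"mean theta = {a / (a + b):.3f}")
    # ...and this posterior serves as the prior for the next batch of data.
```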
The normalization constant is there to make the posterior a true probability that adds to 1, and both the prior and the posterior should be a PDF or PMF; but if the normalization constant was already spent making the posterior a PDF or PMF, then the likelihood need not be a probability distribution.
You can always introduce a constant $C$ to make the likelihood add to one, but that happens outside the Bayes formula.
$$\mathbb P(\theta \mid X) =\frac{\mathbb P(X \mid \theta) \mathbb P(\theta)}{\mathbb P(X)}$$
* where $X$ is the data
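On a discrete grid of $\theta$ values the formula can be spelled out directly; here is a minimal sketch with the same made-up coin-flip data, where $\mathbb P(X)$ is just the sum in the denominator:

```python
# A minimal sketch of the Bayes formula on a discrete grid of theta,
# so the normalization constant P(X) is just a sum. Data is made up.
import numpy as np
from scipy.stats import binom

heads, flips = 7, 10                        # observed data X (hypothetical)
thetas = np.linspace(0.01, 0.99, 99)        # grid of parameter values
prior = np.ones_like(thetas) / len(thetas)  # uniform prior P(theta) on the grid

likelihood = binom.pmf(heads, flips, thetas)  # P(X | theta) on the grid
evidence = np.sum(likelihood * prior)         # P(X), the normalization constant
posterior = likelihood * prior / evidence     # P(theta | X)

print(posterior.sum())  # ~1.0 -- the posterior is a proper distribution
```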
Fisher introduced the likelihood as the probability of the data conditioned on some parameters $\theta$, and as a function of $\theta$ it may not add to one, except by accident.
The posterior, on the other hand, is also conditional, but it always adds to one.
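A minimal sketch of that distinction, in the same made-up coin-flip setting as above: fix $\theta$ and the likelihood is a genuine distribution over the possible data, but fix the data and sweep $\theta$, and the numbers do not add to one.

```python
# Fix theta and sum over all possible data -> 1; fix the data and
# sum over a grid of theta values -> generally not 1.
import numpy as np
from scipy.stats import binom

flips = 10
thetas = np.linspace(0.01, 0.99, 99)

# For a fixed theta, P(k heads | theta) is a distribution over the data.
print(binom.pmf(np.arange(flips + 1), flips, 0.7).sum())  # 1.0

# For fixed data (7 heads), the likelihood as a function of theta is not.
print(binom.pmf(7, flips, thetas).sum())                  # some other number
```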