According to the Wikipedia article "Likelihood function", the likelihood function is defined as:
$$ \mathcal{L}(\theta|x)=P(x|\theta), $$
with parameters $\theta$ and observed data $x$. This equals $p(x|\theta)$ or $p_\theta(x)$, depending on the notation and on whether $\theta$ is treated as a random variable or as a fixed value.
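To make the definition concrete, here is a minimal sketch that evaluates the same expression in both readings. It assumes a binomial model with $n = 10$ trials, which is my own choice for illustration and not part of the Wikipedia definition; the function names `binom_pmf` and `likelihood` are likewise hypothetical.

```python
from math import comb


def binom_pmf(x, n, theta):
    """P(x | theta): probability of x successes in n trials, theta held fixed."""
    return comb(n, x) * theta**x * (1 - theta) ** (n - x)


def likelihood(theta, x, n):
    """L(theta | x): the same expression, read as a function of theta, x held fixed."""
    return binom_pmf(x, n, theta)


n = 10  # assumed number of trials (illustrative only)

# Reading 1: fix theta = 0.3 and sum over all possible outcomes x.
pmf_total = sum(binom_pmf(x, n, 0.3) for x in range(n + 1))

# Reading 2: fix the observation x = 7 and integrate over theta in [0, 1]
# using a midpoint Riemann sum.
steps = 10_000
lik_area = sum(likelihood((i + 0.5) / steps, 7, n) for i in range(steps)) / steps

print(pmf_total)
print(lik_area)
```

Under these assumptions the first sum is 1 up to rounding, while the second is approximately $1/11$: the two notations denote the same number at any given pair $(x, \theta)$, but differ in which argument is held fixed and which is varied.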
The notation $\mathcal{L}(\theta|x)$ seems like an unnecessary abstraction to me. Is there any benefit to using $\mathcal{L}(\theta|x)$, or could one equivalently use $P(x|\theta)$? Why was $\mathcal{L}(\theta|x)$ introduced?