Intuitively, @dsaxton's answer provides the correct logic. Let me just "translate" it to math language.
Suppose the sample $Y_1, \ldots, Y_n \text{ i.i.d.} \sim f_\theta(y)$, $\theta \in \Theta$, where $\Theta$ is the parameter space and the $f_\theta(\cdot)$ are density functions. Once a vector of observations $y = (y_1, \ldots, y_n)'$ has been made, the maximum likelihood principle looks for an estimator $\hat{\theta} \in \Theta$ such that for any $\delta > 0$, the probability
\begin{equation}
P_{\hat{\theta}}[(Y_1, \ldots, Y_n) \in (y_1 - \delta, y_1 + \delta) \times \cdots \times (y_n - \delta, y_n + \delta)] \tag{1}
\end{equation}
is maximized over $\theta \in \Theta$. This suggests considering the quantity
\begin{equation}
L(\theta, \delta) \equiv P_{\theta}[(Y_1, \ldots, Y_n) \in (y_1 - \delta, y_1 + \delta) \times \cdots \times (y_n - \delta, y_n + \delta)],
\quad \theta \in \Theta.\tag{2}
\end{equation}
Note that from the statistical inference point of view, the probability in $(2)$ should be viewed as a function of $\theta$.
We now simplify $(2)$ by invoking the i.i.d. assumption and that $P_\theta$ has density $f_\theta$. Clearly,
$$L(\theta, \delta) = \prod_{i = 1}^n P_\theta[y_i - \delta < Y_i < y_i + \delta] = \prod_{i = 1}^n \int_{y_i - \delta}^{y_i + \delta}f_\theta(y) dy. \tag{3}$$
What we need to show, and this is what answers your question, is the following: if $\hat{\theta}$ maximizes $L(\theta, \delta)$ with respect to $\theta \in \Theta$ for every $\delta > 0$, then it also maximizes the so-called likelihood function
$$L(\theta) = \prod_{i = 1}^n f_{\theta}(y_i). \tag{4}$$
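To make $(3)$ and $(4)$ concrete, here is a small numerical sketch. It assumes a normal location model $Y_i \sim N(\theta, 1)$ with made-up observations (neither the model nor the numbers come from the question; they are just for illustration), and checks that $L(\theta, \delta)/(2\delta)^n$ approaches the density product $L(\theta)$ as $\delta \downarrow 0$:

```python
import math

def norm_pdf(y, theta):
    """Density of N(theta, 1) at y."""
    return math.exp(-0.5 * (y - theta) ** 2) / math.sqrt(2.0 * math.pi)

def norm_cdf(y, theta):
    """CDF of N(theta, 1) at y."""
    return 0.5 * (1.0 + math.erf((y - theta) / math.sqrt(2.0)))

def interval_likelihood(theta, ys, delta):
    """L(theta, delta) in (3): product of P_theta[y_i - delta < Y_i < y_i + delta]."""
    p = 1.0
    for y in ys:
        p *= norm_cdf(y + delta, theta) - norm_cdf(y - delta, theta)
    return p

def density_likelihood(theta, ys):
    """L(theta) in (4): product of the densities f_theta(y_i)."""
    p = 1.0
    for y in ys:
        p *= norm_pdf(y, theta)
    return p

ys = [0.3, -0.8, 1.2, 0.5]   # arbitrary made-up observations
theta = 0.4
for delta in (0.1, 0.01, 0.001):
    scaled = interval_likelihood(theta, ys, delta) / (2.0 * delta) ** len(ys)
    print(f"delta={delta}: L(theta, delta) / (2 delta)^n = {scaled:.8f}")
print(f"L(theta) = {density_likelihood(theta, ys):.8f}")
```

As $\delta$ shrinks, the scaled interval probability agrees with the density product to more and more digits, which is exactly the limiting argument carried out in $(5)$ and below.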
So suppose that for every $\delta > 0$,
$$L(\hat{\theta}, \delta) \geq L(\theta, \delta), \quad \forall \theta \in \Theta. \tag{5}$$
Dividing both sides of $(5)$ by $(2\delta)^n$ and letting $\delta \downarrow 0$ (assuming each $f_\theta$ is continuous at the $y_i$, so that $\frac{1}{2\delta}\int_{y_i - \delta}^{y_i + \delta} f_\theta(y)\, dy \to f_\theta(y_i)$) gives
$$\prod_{i = 1}^n f_{\hat{\theta}}(y_i) \geq \prod_{i = 1}^n f_{\theta}(y_i), \quad \forall \theta \in \Theta,$$
which is precisely $L(\hat{\theta}) \geq L(\theta), \forall \theta \in \Theta$.
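The conclusion can also be seen numerically. In the same made-up normal location model as above (an assumption for illustration only), a crude grid search shows the maximizer of $L(\theta, \delta)$ settling on the maximizer of $L(\theta)$, which for this model is the sample mean, as $\delta$ shrinks:

```python
import math

def norm_cdf(y, theta):
    """CDF of N(theta, 1) at y."""
    return 0.5 * (1.0 + math.erf((y - theta) / math.sqrt(2.0)))

def interval_loglik(theta, ys, delta):
    """log L(theta, delta): sum of log interval probabilities from (3)."""
    return sum(math.log(norm_cdf(y + delta, theta) - norm_cdf(y - delta, theta))
               for y in ys)

ys = [0.3, -0.8, 1.2, 0.5]                        # arbitrary made-up observations
grid = [i / 1000.0 for i in range(-2000, 2001)]   # candidate theta values

for delta in (1.0, 0.1, 0.001):
    theta_hat = max(grid, key=lambda t: interval_loglik(t, ys, delta))
    print(f"delta={delta}: argmax of L(theta, delta) on the grid = {theta_hat}")

print(f"sample mean (the MLE for a normal location model) = {sum(ys) / len(ys)}")
```

For small $\delta$ the grid maximizer lands on (a grid point next to) the sample mean $0.3$, matching the claim that maximizing $(3)$ for vanishing $\delta$ amounts to maximizing $(4)$.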
In other words, maximizing the so-called likelihood function $(4)$ (a product of densities) is a necessary condition for carrying out the maximum likelihood principle. From this point of view, the product-of-densities form makes sense.
The above is just my own interpretation; any comments or critiques are very welcome.