In maximum likelihood estimation, we maximize the likelihood.
I don't understand how this is possible: for any reasonable data set, the likelihood of hitting that EXACT data set is obviously zero! So how can we ever hope to maximize it?
For example, take data from a bunch of coin flips. The coin is fair, and you want to evaluate the likelihood at $p = 0.5$. This is obviously the optimal value, since the coin is fair.
Yet if you have BILLIONS of data points, your likelihood will be $\prod_{i=1}^{N} p^{x_i}(1-p)^{1 - x_i}$, where $N$ is in the BILLIONS and $x_i \in \{0, 1\}$ is the outcome of flip $i$.
It doesn't matter what your data looks like: that product will ALWAYS be zero. There's no way around it. You're multiplying BILLIONS of small numbers, so the value is zero.
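To make the concern concrete, here is a minimal sketch of what I mean, in plain Python. The function name `bernoulli_likelihood`, the sample size of 2000, and the random seed are all just my choices for illustration; nothing here comes from a particular library. You don't even need billions of flips: a couple of thousand is enough for the product to become zero in double precision.

```python
import random


def bernoulli_likelihood(flips, p):
    """Naive likelihood: the product of p^x * (1 - p)^(1 - x) over all flips."""
    likelihood = 1.0
    for x in flips:
        # Each factor is p for heads (x == 1) and 1 - p for tails (x == 0).
        likelihood *= p if x == 1 else (1.0 - p)
    return likelihood


random.seed(0)  # arbitrary seed, only for reproducibility
flips = [random.randint(0, 1) for _ in range(2000)]  # simulated fair coin

small = bernoulli_likelihood(flips[:10], 0.5)  # first 10 flips only
large = bernoulli_likelihood(flips, 0.5)       # all 2000 flips

print(small)  # 0.0009765625, i.e. 0.5 ** 10
print(large)  # 0.0, the product has underflowed
```

The exact value for 2000 flips, $0.5^{2000}$, is positive but far smaller than the smallest positive double, so the computer stores it as zero.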
Numerically, then (which is how most likelihood methods are implemented), how would one possibly maximize a value that is identically zero?