After some more research regarding this topic, I have found the correct answer. It turns out that what I was trying to describe and implement is a variant of the Multiple Hypothesis Tracker (MHT) [1, 2].
Answer to my question:
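The likelihood of a measurement $\mathbf{z}_k$ at time step $k$, given the set of previous measurements $\textbf{Z}_{k-1} = \{ \textbf{z}_{0}, \ldots , \textbf{z}_{k-1} \}$, is the Gaussian predictive density of the Kalman filter. The likelihood of the whole measurement sequence is the product of these per-step factors (taken from Wikipedia [1]):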
$$
p(\textbf{z}) = \prod^T_{k=0} p\left( \textbf{z}_k \mid \textbf{z}_{k-1}, \ldots , \textbf{z}_{0} \right) = \prod^T_{k=0} \mathcal{N}\left( \textbf{z}_k; \textbf{H}_k \hat{\textbf{x}}_{k|k-1}, \textbf{S}_k \right),
$$
where $\hat{\textbf{x}}_{k|k-1}$ is the predicted state vector of the Kalman filter, $\textbf{H}_k$ is the measurement matrix that maps KF states to measurements (part of the system model), and $\mathbf{S}_k$ is the innovation covariance, which is calculated as part of the Kalman filter update step.
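To make a single factor of the product concrete, it can be written out as an ordinary multivariate Gaussian density. Here $\tilde{\mathbf{y}}_k$ denotes the innovation (measurement minus predicted measurement) and $d_y$ the dimension of the measurement vector:
$$
\mathcal{N}\left( \textbf{z}_k; \textbf{H}_k \hat{\textbf{x}}_{k|k-1}, \textbf{S}_k \right) = \frac{1}{\sqrt{(2\pi)^{d_y} |\mathbf{S}_k|}} \exp\left( -\frac{1}{2} \tilde{\mathbf{y}}_k^T \mathbf{S}_k^{-1} \tilde{\mathbf{y}}_k \right), \qquad \tilde{\mathbf{y}}_k = \textbf{z}_k - \textbf{H}_k \hat{\textbf{x}}_{k|k-1}.
$$
Taking the logarithm of this expression gives $-\frac{1}{2}\left( \tilde{\mathbf{y}}_k^T \mathbf{S}_k^{-1} \tilde{\mathbf{y}}_k + \log{|\mathbf{S}_k|} + d_y\log{2\pi} \right)$, which is the amount each time step contributes to the log-likelihood.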
This product can be calculated iteratively alongside the Kalman filter, but multiplying many small densities quickly underflows, which is why the log-likelihood $l_k = \log{p(\mathbf{z}_0, \ldots, \mathbf{z}_k)}$ is usually used instead. Starting from $l_{-1} = 0$, the iterative update equation for the log-likelihood is (again from Wikipedia [1])
$$
l_k = l_{k-1} - \frac{1}{2}\left( \tilde{\mathbf{y}}_k^T \mathbf{S}_k^{-1} \tilde{\mathbf{y}}_k + \log{|\mathbf{S}_k|} + d_y\log{2\pi} \right),
$$
where $\tilde{\mathbf{y}}_k = \mathbf{z}_k - \mathbf{H}_k \hat{\mathbf{x}}_{k|k-1}$ is the innovation vector, which is calculated as part of the KF update step, and $d_y$ is the number of dimensions of the measurement.
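The update above can be sketched in a few lines of NumPy. This is only a minimal illustration, not a full MHT implementation: the function name `kf_log_likelihood`, the time-invariant matrices `F`, `H`, `Q`, `R`, and the convention that `x0`, `P0` are the predicted state and covariance for the first measurement are all assumptions made for the example.

```python
import numpy as np

def kf_log_likelihood(zs, F, H, Q, R, x0, P0):
    """Accumulate the log-likelihood of a measurement sequence under a linear KF.

    Assumption: x0 and P0 are the *predicted* state and covariance for the
    first measurement, i.e. x_{0|-1} and P_{0|-1}.
    """
    x = np.asarray(x0, dtype=float)
    P = np.asarray(P0, dtype=float)
    d_y = H.shape[0]          # dimension of the measurement vector
    log_lik = 0.0             # l_{-1} = 0
    for z in zs:
        # Innovation and innovation covariance from the predicted state
        y = z - H @ x
        S = H @ P @ H.T + R
        # Log-likelihood update: l_k = l_{k-1} - 0.5 * (y^T S^-1 y + log|S| + d_y log 2pi)
        _, logdet_S = np.linalg.slogdet(S)
        quad = y @ np.linalg.solve(S, y)
        log_lik -= 0.5 * (quad + logdet_S + d_y * np.log(2.0 * np.pi))
        # Standard KF update step
        K = P @ H.T @ np.linalg.inv(S)
        x = x + K @ y
        P = (np.eye(len(x)) - K @ H) @ P
        # Predict to the next time step
        x = F @ x
        P = F @ P @ F.T + Q
    return log_lik

# Hypothetical constant-velocity model with position-only measurements
F = np.array([[1.0, 1.0], [0.0, 1.0]])
H = np.array([[1.0, 0.0]])
Q = 0.01 * np.eye(2)
R = np.array([[0.25]])
x0, P0 = np.array([0.0, 1.0]), np.eye(2)

track_a = [np.array([0.1]), np.array([1.0]), np.array([2.1])]   # follows the model
track_b = [np.array([0.1]), np.array([5.0]), np.array([-3.0])]  # does not
ll_a = kf_log_likelihood(track_a, F, H, Q, R, x0, P0)
ll_b = kf_log_likelihood(track_b, F, H, Q, R, x0, P0)
print(ll_a, ll_b)
```

In a hypothesis-tracking setting, each candidate association of measurements to a track would get its own running log-likelihood, and the sequence that is consistent with the motion model (`track_a` here) should score higher than the inconsistent one.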
Notes regarding using other metrics:
Metrics that measure a distance between distributions (such as the Mahalanobis distance or the Kullback-Leibler divergence) are not well suited to this problem, since they usually describe the 'similarity' of two random distributions, whereas here the goal is to express the likelihood of a measurement having been generated by a stochastic system. They take into account only the two random distributions, not the measurement model.
Further reading
For more information on this topic, I can recommend the Wikipedia article on the Kalman filter [1] and a great summary of multitarget tracking algorithms and data association problems [2].
[1]: https://en.wikipedia.org/wiki/Kalman_filter#Marginal_likelihood
[2]: Y. Bar-Shalom, F. Daum, and J. Huang, "The Probabilistic Data Association Filter: Estimation in the Presence of Measurement Origin Uncertainty," 2009