I recently stumbled into a similar question.
To answer why an asymmetric divergence can be more favourable than a symmetric one, consider a scenario where you want to quantify the quality of a proposal distribution used in importance sampling (IS). If you are unfamiliar with IS, the key fact needed here is that an efficient IS scheme requires a proposal distribution with heavier tails than the target distribution; otherwise the importance weights can have very large (or even infinite) variance.
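To see why tail behaviour matters, here is a minimal sketch (the target, the two proposals, and the sample size are chosen purely for illustration) that estimates $E[X^2]$ under a wide normal target with a light-tailed and a heavy-tailed proposal:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 100_000

# Target: Normal(0, 25), i.e. standard deviation 5, so E[X^2] = 25.
target = stats.norm(0, 5)

for label, proposal in [("light-tailed N(0, 1)  ", stats.norm(0, 1)),
                        ("heavy-tailed N(0, 100)", stats.norm(0, 10))]:
    x = proposal.rvs(size=n, random_state=rng)
    # Importance weights: target density over proposal density.
    w = np.exp(target.logpdf(x) - proposal.logpdf(x))
    estimate = np.mean(w * x**2)        # IS estimate of E[X^2]
    ess = w.sum()**2 / np.sum(w**2)     # effective sample size
    print(f"{label}: estimate = {estimate:6.2f}, ESS = {ess:9.0f} / {n}")
```

With the light-tailed proposal the weight function is $\exp(0.48x^2)/5$, which is unbounded, so the estimator has infinite variance and its value is unreliable; the heavy-tailed proposal gives bounded weights and behaves well.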
Denote two distributions $H=\text{Normal}(0, 25)$ and $L=\text{Normal}(0, 1)$, where the second parameter is the variance, so $H$ is much wider than $L$. Suppose you target $H$ with IS, using $L$ as the proposal distribution. To quantify the quality of your proposal, you might compute the Jensen-Shannon (JS) divergence $\text{JS}(L, H)$ and the Kullback-Leibler (KL) divergence of the target from the proposal, $\text{KL}(H || L)$, and obtain some values. Both should give you some sense of how good your proposal distribution $L$ is. Nothing to see here yet. However, consider reversing the setup, i.e., target $L$ with IS using $H$ as the proposal distribution. The JS divergence stays the same because it is symmetric, while the corresponding KL divergence, $\text{KL}(L || H)$, is much lower. In short, we expect using $H$ to target $L$ to be fine, but using $L$ to target $H$ to be problematic. The KL divergence aligns with this expectation, $\text{KL}(H || L) > \text{KL}(L || H)$; the JS divergence does not.
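To put numbers on this, here is a minimal sketch, again assuming the second parameter above is the variance (so $H$ has standard deviation 5) and an integration range of $\pm 30$ chosen to cover both distributions; the KL between two univariate normals has a closed form, while the JS divergence is computed by numerical integration:

```python
import numpy as np
from scipy import stats
from scipy.integrate import quad

H = stats.norm(0, 5)   # Normal(0, 25): wide
L = stats.norm(0, 1)   # Normal(0, 1):  narrow

def kl_normal(p, q):
    """Closed-form KL(p || q) for two univariate normals, in nats."""
    mp, sp = p.mean(), p.std()
    mq, sq = q.mean(), q.std()
    return np.log(sq / sp) + (sp**2 + (mp - mq)**2) / (2 * sq**2) - 0.5

def js(p, q):
    """JS divergence via numerical integration (no closed form in general)."""
    m = lambda x: 0.5 * (p.pdf(x) + q.pdf(x))
    kl_pm = quad(lambda x: p.pdf(x) * np.log(p.pdf(x) / m(x)), -30, 30)[0]
    kl_qm = quad(lambda x: q.pdf(x) * np.log(q.pdf(x) / m(x)), -30, 30)[0]
    return 0.5 * (kl_pm + kl_qm)

print(f"KL(H || L) = {kl_normal(H, L):.2f} nats")    # ~10.4: L is a poor proposal for H
print(f"KL(L || H) = {kl_normal(L, H):.2f} nats")    # ~1.1:  H is a decent proposal for L
print(f"JS(H, L) = JS(L, H) = {js(H, L):.2f} nats")  # symmetric: same either way
```

The two KL directions differ by an order of magnitude, which is exactly the directional information the single symmetric JS value throws away.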
This asymmetry aligns with our goal: loosely speaking, the KL divergence correctly accounts for the direction of the discrepancy between the two distributions.
Another factor to consider is that the JS divergence can be significantly more computationally challenging than the KL divergence: it involves the mixture $(H + L)/2$, so it rarely has a closed form, whereas KL between two Gaussians does (as in the sketch above).