Gregory Gundersen wrote a blog post about this in 2018. He explicitly answers the questions:
What does a “random node” mean and what does it mean for backprop to “flow” or not flow through such a node?
The following excerpt should answer your questions:
**Undifferentiable expectations**
Let’s say we want to take the gradient w.r.t. $\theta$ of the following expectation, $$\mathbb{E}_{p(z)}[f_{\theta}(z)]$$ where $p$ is a density. Provided we can differentiate $f_{\theta}(z)$, we can easily compute the gradient:
$$ \begin{align} \nabla_{\theta} \mathbb{E}_{p(z)}[f_{\theta}(z)]
&= \nabla_{\theta} \Big[ \int_{z} p(z) f_{\theta}(z) dz \Big] \\
&= \int_{z} p(z) \Big[\nabla_{\theta} f_{\theta}(z) \Big] dz \\
&= \mathbb{E}_{p(z)} \Big[\nabla_{\theta} f_{\theta}(z) \Big] \end{align}
$$
In words, the gradient of the expectation is equal to the expectation of the gradient; a quick numerical sanity check of this identity is sketched below.
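Here is a minimal Monte Carlo sketch (assuming PyTorch; the choices $f_{\theta}(z) = (z - \theta)^2$ and $p = \mathcal{N}(0, 1)$ are illustrative, not from the post). Since $\mathbb{E}_{\mathcal{N}(0,1)}[(z - \theta)^2] = \theta^2 + 1$, the true gradient is $2\theta$:

```python
import torch

theta = torch.tensor(1.5, requires_grad=True)

# p(z) = N(0, 1) does not depend on theta, so sampling happens
# outside the computation graph.
z = torch.randn(100_000)

# Monte Carlo estimate of E_p[f_theta(z)] with f_theta(z) = (z - theta)^2.
estimate = ((z - theta) ** 2).mean()
estimate.backward()

print(theta.grad)  # ~= 2 * theta = 3.0, the true gradient of theta^2 + 1
```

Because $p$ does not depend on $\theta$, the samples are constants as far as autograd is concerned, and the gradient flows entirely through $f_{\theta}$. But what happens if our density $p$ is also parameterized by $\theta$?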
$$ \begin{align} \nabla_{\theta} \mathbb{E}_{p_{\theta}(z)}[f_{\theta}(z)]
&= \nabla_{\theta} \Big[ \int_{z} p_{\theta}(z) f_{\theta}(z) dz \Big] \\
&= \int_{z} \nabla_{\theta} \Big[ p_{\theta}(z) f_{\theta}(z) \Big] dz \\
&= \int_{z} f_{\theta}(z) \nabla_{\theta} p_{\theta}(z) dz + \int_{z} p_{\theta}(z) \nabla_{\theta} f_{\theta}(z) dz \\
&= \underbrace{\int_{z} f_{\theta}(z) \nabla_{\theta} p_{\theta}(z) dz}_{\text{What about this?}} + \mathbb{E}_{p_{\theta}(z)} \Big[ \nabla_{\theta} f_{\theta}(z) \Big] \end{align}
$$
The first term of the last equation is not guaranteed to be an expectation, because $\nabla_{\theta} p_{\theta}(z)$ is not in general a density, so we cannot approximate the integral by sampling. Monte Carlo methods require that we can sample from $p_{\theta}(z)$, but not that we can take its gradient. This is not a problem if we have an analytic expression for $\nabla_{\theta} p_{\theta}(z)$, but in general we do not.
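This is exactly the “random node” from the question: a sampling operation whose output has no differentiable dependence on $\theta$, so backprop cannot flow through it. Here is a minimal sketch of both the failure and the fix (again assuming PyTorch; $p_{\theta} = \mathcal{N}(\theta, 1)$ and $f(z) = z^2$ are my own illustrative choices, and `rsample` is PyTorch's implementation of the reparameterization trick that Gundersen's post goes on to derive):

```python
import torch
from torch.distributions import Normal

theta = torch.tensor(1.5, requires_grad=True)

# Naive attempt: sample z ~ N(theta, 1) directly. .sample() is a random
# node: it detaches from the graph, so there is no path back to theta.
z = Normal(theta, 1.0).sample((100_000,))
print(z.requires_grad)  # False -- backprop cannot flow through this node

# Reparameterized: z = theta + eps with eps ~ N(0, 1). The randomness now
# lives in eps, and the path from z back to theta is differentiable.
z = Normal(theta, 1.0).rsample((100_000,))
estimate = (z ** 2).mean()  # Monte Carlo estimate of E[z^2] = theta^2 + 1
estimate.backward()
print(theta.grad)           # ~= 2 * theta = 3.0, the true gradient
```

Rewriting the sample as $z = g_{\theta}(\epsilon)$ with fixed noise $\epsilon \sim \mathcal{N}(0, 1)$ moves the expectation onto a density that no longer depends on $\theta$, which is the subject of the rest of the post.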