I don't believe that the linked question (Why does increasing the sample size lower the variance?) appropriately addresses my question!
The linked question explains why the average of $n$ iid random variables has a smaller variance than any single one of them. In that setting, the effect of adding a "training example" is simply to add one more random variable to the summation representing the average: $$ Y = \frac{\sum_{i = 1}^{n} X_{i}}{n}$$
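For reference, the calculation behind that claim: if the $X_i$ are iid with $\operatorname{Var}(X_i) = \sigma^2$, then
$$ \operatorname{Var}(Y) = \operatorname{Var}\!\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right) = \frac{1}{n^2}\sum_{i=1}^{n}\operatorname{Var}(X_i) = \frac{\sigma^2}{n}, $$
which shrinks as $n$ grows.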
Adding a training example in an ML context has a quite different effect (see: What is meant by the variance of *functions* in *Introduction to Statistical Learning*?):
$$ \mathcal{A} : \mathcal{T} \rightarrow \{f \mid f: X \rightarrow \mathbb{R} \} $$
A learning algorithm $\mathcal{A}$ is a mapping from any training set $t \in \mathcal{T}$ (a subset of the set of all training examples) to some predictor $f_t$ trained by running $\mathcal{A}$ on $t$. Each such subset has a certain probability of being selected. Thus, to add another training example to the set of training examples is to, perhaps, fundamentally change the probability of some subsets being selected. This, in turn, alters $E[A_{x_{0}}]$, the expectation of the random variable $A_{x_{0}} : \mathcal{T} \rightarrow \mathbb{R},\; t \mapsto f_{t}(x_{0})$, where $f_{t}$ is the predictor given by a particular choice of subset $t \in \mathcal{T}$. As a result, this also changes the variance of $A_{x_{0}}$.
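To be explicit about what I mean by that expectation and variance, writing $p(t)$ for the probability that the subset $t$ is selected:
$$ E[A_{x_0}] = \sum_{t \in \mathcal{T}} p(t)\, f_t(x_0), \qquad \operatorname{Var}[A_{x_0}] = \sum_{t \in \mathcal{T}} p(t)\,\bigl(f_t(x_0) - E[A_{x_0}]\bigr)^2. $$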
Andrew Ng's ML course makes the claim that increasing the number of training examples in some training set (some subset of the set of all training examples) decreases the variance of the learning algorithm. Can this be shown rigorously using the above definition of variance, or at least be given some intuitive grounding? I can't find any explanation in his course of why this should be true, only a learning curve showing the phenomenon.
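In case it is useful, here is how I am picturing the claim empirically (a toy sketch of my own; the sinusoidal data-generating process, the polynomial model, and the point $x_0$ are arbitrary assumptions, not anything from the course): repeatedly draw a training set of size $n$, fit a predictor, record its prediction at $x_0$, and look at the spread of those predictions as $n$ grows.

```python
# Toy Monte Carlo estimate of Var[A_{x0}] (my own setup, not from the course):
# repeatedly draw a training set of size n, fit a model, and record its
# prediction at a fixed point x0. The empirical variance of those predictions
# is the variance of the learning algorithm at x0.
import numpy as np

rng = np.random.default_rng(0)

def draw_training_set(n):
    """Assumed data-generating process: y = sin(2*pi*x) + Gaussian noise."""
    x = rng.uniform(0, 1, size=n)
    y = np.sin(2 * np.pi * x) + rng.normal(scale=0.3, size=n)
    return x, y

def fit_and_predict(x, y, x0, degree=3):
    """The 'learning algorithm': fit a degree-3 polynomial, predict at x0."""
    coeffs = np.polyfit(x, y, degree)
    return np.polyval(coeffs, x0)

x0 = 0.5
for n in [10, 30, 100, 300]:
    # Each repetition is one draw of the random variable A_{x0}.
    preds = [fit_and_predict(*draw_training_set(n), x0) for _ in range(2000)]
    print(f"n = {n:4d}   estimated Var[A_x0] = {np.var(preds):.5f}")
```

If the claim holds here, the printed variances should shrink as $n$ grows, mirroring the learning curve from the course; but a toy simulation is not the rigorous (or even intuitive) argument I am asking for.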