I have been reading Begg, Welsh, and Bratvold (2014), which is an excellent and lucid discussion of the distinction between uncertainty and variability (from a petro/geostatistics perspective). The abstract defines them:
"Uncertainty means we do not know the value (or outcome) of some quantity, ... Variability refers to the multiple values a quantity has at different locations, times or instances"
and describes how they are captured:
"Uncertianty [sic] is quantified by a probability distribution which depends upon our state of information about the likelihood of what the single, true value of the uncertain quantity is. Variability is quantified by a distribution of frequencies of multiple instances of the quantity, derived from observed data."
This makes sense to me. Variability within a population is defined by some frequency mass function (discrete case) or distribution function (continuous case). If we have perfect information about the whole population, there is no uncertainty, and these functions can be exactly specified. If we do NOT have perfect information about the population (e.g. we only have access to a limited sample), then there is some uncertainty about these functions, which can be described as probability distributions on the estimated frequencies or distribution parameters.
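To make this concrete for myself, here is a minimal sketch of the discrete case. Everything in it is my own assumption rather than anything from the paper: the made-up population of 10,000 binary elements, the sample size of 50, the uniform Beta(1, 1) prior, and treating the sample as approximately binomial (reasonable only because the sample is small relative to the population).

```python
import random

random.seed(0)

# Hypothetical finite population of 10,000 elements, each taking the value 1 or 0.
population = [1 if random.random() < 0.3 else 0 for _ in range(10_000)]

# Variability: if every element is measured, the frequency of 1s is known exactly.
true_freq = sum(population) / len(population)

# Uncertainty: with only a limited sample, the frequency itself is not known.
sample = random.sample(population, 50)
k, n = sum(sample), len(sample)

# Under an assumed uniform Beta(1, 1) prior, the state of information about the
# population frequency is a Beta(1 + k, 1 + n - k) distribution.
a, b = 1 + k, 1 + n - k
post_mean = a / (a + b)
post_sd = (a * b / ((a + b) ** 2 * (a + b + 1))) ** 0.5

print(f"population frequency (variability, exact): {true_freq:.3f}")
print(f"belief about that frequency from n={n}: mean {post_mean:.3f}, sd {post_sd:.3f}")
```

As I read it, `true_freq` is the variability (a property of the population), while the Beta distribution summarised by `post_mean` and `post_sd` is the uncertainty (a property of my information), and it would narrow as the sample grows.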
My understanding is that this population-level variability collapses to uncertainty when we ask "what is the true value of a particular (unmeasured) element of the population?". There is a semantic distinction between the two sources of uncertainty here - the uncertainty about the population variability is epistemic uncertainty, and the contribution of population variability to the uncertainty about an individual element is aleatoric uncertainty.
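One way I can write this split down, as a sketch only, is with the law of total variance. The symbols are my own notation, not the paper's: $x$ is the unmeasured element, $\theta$ the parameters of the population distribution, $D$ the observed data, and I am assuming $x$ is independent of $D$ once $\theta$ is given.

```latex
\operatorname{Var}(x \mid D)
  = \underbrace{\mathbb{E}_{\theta \mid D}\!\left[\operatorname{Var}(x \mid \theta)\right]}_{\text{aleatoric: population variability}}
  + \underbrace{\operatorname{Var}_{\theta \mid D}\!\left(\mathbb{E}[x \mid \theta]\right)}_{\text{epistemic: uncertainty about } \theta}
```

With perfect information about the population the second term vanishes, and what remains is the population variability itself.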
Although it is not explicitly framed as such, this strikes me as a very Bayesian interpretation of probability, since they describe probability as interpretable solely as a personal measure of belief in the truth of a given well-defined statement.
My understanding of the frequentist interpretation of probability is that the probability of a given statement is the true proportion of the population for which that statement holds, if the whole population could be measured (or the limit of the sample proportion as the number of samples goes to infinity).
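A quick simulation of what I mean by the limiting proportion, as a sketch under my own assumptions: the "population" is an idealised infinite one in which a statement holds with a fixed proportion `p_true` (a value I made up), and `running_proportion` is just a helper name for this illustration.

```python
import random


def running_proportion(p_true, n_draws, seed=0):
    """Return the sample proportion after each of n_draws independent draws."""
    rng = random.Random(seed)
    hits = 0
    proportions = []
    for i in range(1, n_draws + 1):
        # Each draw is one element of the population; the statement holds
        # for it with probability p_true.
        hits += rng.random() < p_true
        proportions.append(hits / i)
    return proportions


props = running_proportion(p_true=0.3, n_draws=100_000)
for n in (10, 100, 1_000, 10_000, 100_000):
    print(n, round(props[n - 1], 4))
```

The printed proportions should settle towards `p_true` as the number of draws grows, which is the sense in which I understand the frequentist probability to be a property of the population rather than of my information.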
I can kind of see how this makes sense (though not as much as the Bayesian interpretation), but I'm finding it hard to make a clear distinction between uncertainty and variability under the frequentist interpretation within the framework provided by Begg, Welsh, and Bratvold (2014). I think that frequentist probability actually represents population variability here, but in that case, what does uncertainty represent? Just the potential wrongness of sample estimates? Or something else? And how is it quantified? By confidence interval widths? I feel like I am missing some nuance here.
References
- Begg, Steve H., Matthew B. Welsh, and Reidar B. Bratvold. “Uncertainty vs. Variability: What’s the Difference and Why Is It Important?” In Day 1 Mon, May 19, 2014, D011S003R002. Houston, Texas: SPE, 2014. https://doi.org/10.2118/169850-MS.