Why does a CDF uniquely define a distribution?

I have always been told that a CDF is unique, whereas a PDF/PMF is not. Why is that? Can you give an example where a PDF/PMF is not unique?

DKangeyan
  • Concerning uniqueness, you might like to ponder the difference between the PDF of a uniform distribution on $[0,1]$ and a uniform distribution on its interior, $(0,1)$ (see the sketch following these comments). Another fun exercise--which addresses the question of whether a PDF even exists--is to think about what the PDF of a distribution over the rational numbers would look like. For instance, let $\Pr(j2^{-i})=2^{1-2i}$ whenever $0\lt j2^{-i}\lt 1$, $i\ge 1$, and $j$ is odd. – whuber Feb 06 '15 at 23:43
  • Not all distributions even have a PDF, or have a PMF, while looking at the CDF gives a unifying view of things. Continuous variables have smooth-looking CDFs, discrete variables have a "staircase", and some CDFs are mixed. – Silverfish Feb 07 '15 at 00:01
  • @Silverfish: ...and some are *none of the above!* :-) – cardinal Feb 07 '15 at 01:17
  • To address the title (perhaps somewhat loosely), the CDF defines a distribution because the CDF (or equivalently just DF/'distribution function'; the "C" acts only to clarify that's the object we're talking about) is what the term 'distribution' literally refers to; the "D" is the clue on that part. That it's unique follows from the "F" -- functions are single-valued, so if two distribution functions are identical the object they define is the same; if the DFs differed anywhere the thing they are the definition of would be different at those points. Is that a tautology? I think it is. – Glen_b Feb 07 '15 at 02:31
  • @Null: Are you looking for a high-level ("intuitive") answer or something that will address the issue somewhat more rigorously from a mathematical point of view? (Which is not to suggest that these are oppositional viewpoints!) – cardinal Feb 07 '15 at 16:51
  • @cardinal It would be great if you can provide an answer that involves some rigorous math. – DKangeyan Feb 07 '15 at 23:28
  • http://www.math.uah.edu/stat/dist/Density.html this seems like it might explain the answer, if I could understand it! Starting from the beginning of the chapter might help though. It took me a while to find it so just thought I'd share it, though hopefully someone provides you a direct answer. Interesting question. – HFBrowning Feb 09 '15 at 20:24
  • @Glen_b It's tautological only to the trained intuition. A distribution function $F$ only gives probabilities of the form $F(x)=\Pr\{\omega\in\Omega\,|\,X(\omega)\le x\}$ whereas the entire *distribution* specifies probabilities of the form $\Pr(\{\omega\in\Omega\,|\,X(\omega)\in\mathcal{B}\})$ for arbitrary measurable sets $\mathcal{B}\subset\mathbb R$. You have to show $F$ determines the distribution. As NicholasB points out, that's a matter of extending a pre-measure from a semi-ring (of half-open intervals), $\mu((a,b])=F(b)-F(a)$, to the full Lebesgue sigma-field and showing it's unique. – whuber Feb 10 '15 at 20:51

3 Answers


Let us recall some things. Let $(\Omega,A,P)$ be a probability space: $\Omega$ is our sample set, $A$ is our $\sigma$-algebra, and $P$ is a probability measure defined on $A$. A random variable is a measurable function $X:\Omega \to \mathbb{R}$, i.e. $X^{-1}(S) \in A$ for every Lebesgue measurable subset $S$ of $\mathbb{R}$. If you are not familiar with this concept, then everything I say afterwards will not make any sense.
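For readers meeting this machinery for the first time, a minimal concrete instance may help (this illustration is added here, not part of the original answer): take $\Omega=\{H,T\}$ for a fair coin, $A=2^{\Omega}$, $P(\{H\})=P(\{T\})=\tfrac12$, and define $X(H)=1$, $X(T)=0$. Measurability is automatic because $A$ contains every subset of $\Omega$.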

Any time we have a random variable $X:\Omega \to \mathbb{R}$, it induces a probability measure $X'$ on $\mathbb{R}$ by the pushforward construction. In other words, $X'(S) = P(X^{-1}(S))$. It is trivial to check that $X'$ is a probability measure on $\mathbb{R}$. We call $X'$ the distribution of $X$.
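Continuing the coin-flip illustration above: for any measurable $S\subset\mathbb{R}$ we have $X'(S)=P(X^{-1}(S))$, so $X'(\{1\})=P(\{H\})=\tfrac12$, $X'(\{0\})=P(\{T\})=\tfrac12$, and, say, $X'([0.5,\,7])=P(\{H\})=\tfrac12$. Thus $X'$ is exactly the Bernoulli$(\tfrac12)$ distribution on $\mathbb{R}$.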

Now related to this concept is something called the distribution function of a random variable. Given a random variable $X:\Omega \to \mathbb{R}$, we define $F(x) = P(X\leq x)$, that is, $F(x) = X'((-\infty,x])$. Distribution functions $F:\mathbb{R} \to [0,1]$ have the following properties:

  1. $F$ is right-continuous.

  2. $F$ is non-decreasing.

  3. $\lim_{x\to\infty} F(x) = 1$ and $\lim_{x\to-\infty} F(x) = 0$.
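To connect these properties with Silverfish's "staircase" comment, here is a small numerical sketch (added here, assuming NumPy is available; the function name `bernoulli_cdf` is mine) of the CDF of the Bernoulli$(\tfrac12)$ variable above:

    import numpy as np

    def bernoulli_cdf(x, p=0.5):
        """CDF of a Bernoulli(p) variable: F(x) = P(X <= x)."""
        x = np.asarray(x, dtype=float)
        # 0 below the atom at 0, 1-p on [0, 1), and 1 from the atom at 1 onward
        return np.where(x < 0, 0.0, np.where(x < 1, 1.0 - p, 1.0))

    print(bernoulli_cdf([-2, -0.5, 0, 0.5, 1, 2]))  # a non-decreasing staircase
    print(bernoulli_cdf(0.0))    # 0.5 -- F takes the "upper" value at the jump
    print(bernoulli_cdf(-1e-9))  # 0.0 -- the left-hand limit differs: P(X=0)=0.5

The jump at each atom lands on its upper value, which is exactly the right-continuity in property 1.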

Random variables that are equal clearly have the same distribution, and hence the same distribution function.

To reverse the process and obtain a measure with a given distribution function is pretty technical. Say you are given a distribution function $F$. Define $\mu((a,b]) = F(b) - F(a)$. You have to show that $\mu$ is a (countably additive) pre-measure on the semi-algebra of half-open intervals $(a,b]$. Afterwards you can apply the Carathéodory extension theorem to extend $\mu$ to a probability measure on the Borel subsets of $\mathbb{R}$.
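The extension theorem gives existence; the uniqueness the question asks about is the complementary half. A standard sketch (following cardinal's comment below; these details are not in the original answer): the collection $\{(-\infty,b] : b\in\mathbb{R}\}$ is a $\pi$-system (closed under finite intersections) that generates the Borel $\sigma$-algebra. If two probability measures $\mu_1,\mu_2$ have the same distribution function, then $\mu_1((-\infty,b])=F(b)=\mu_2((-\infty,b])$ for every $b$, so they agree on this $\pi$-system, and Dynkin's $\pi$-$\lambda$ theorem forces $\mu_1=\mu_2$ on all Borel sets. This is precisely why the CDF pins down the distribution uniquely.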

Nicolas Bourbaki
  • This is a good start to an answer, but may be unintentionally obscuring the matter at hand a bit. The main issue seems to be showing that two measures with the same distribution function are, in fact, equal. This requires nothing more than Dynkin's $\pi$-$\lambda$ theorem and the fact that sets of the form $(-\infty, b]$ form a $\pi$-system that generates the Borel $\sigma$-algebra. Then the nonuniqueness of a density (assuming it exists!) can be addressed and contrasted with the above. – cardinal Feb 10 '15 at 19:37
  • (One additional minor quibble: Random variables are usually defined in terms of Borel sets rather than Lebesgue sets.) I think with some minor edits this answer will become quite clear. :-) – cardinal Feb 10 '15 at 19:38
  • @cardinal I think of analysis first, probability second. Therefore, this may explain why I prefer to think of Lebesgue sets. In either case it does not affect what was said. – Nicolas Bourbaki Feb 10 '15 at 20:17

To answer the request for an example of two distinct densities for the same distribution (i.e., densities with the same integral over every interval, hence the same distribution function), consider these functions defined on the real numbers:

 f(x)  = 1                  ; when x is an odd integer
 f(x)  = exp(-x^2)/sqrt(pi) ; elsewhere

and then:

 f2(x) = 1                  ; when x is an even integer
 f2(x) = exp(-x^2)/sqrt(pi) ; elsewhere

They are not equal at every $x$, but both are densities for the same distribution; hence densities are not uniquely determined by the (cumulative) distribution function. When two densities on the real line differ only on a set of Lebesgue measure zero (here, a countable set of integers), their integrals over every interval agree. Mathematical analysis is not really for the faint of heart or the determinedly concrete mind.
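A quick numerical cross-check of this example (added here, assuming NumPy and SciPy are available; the helper names are mine): quadrature samples only finitely many points, so it never "sees" the integers, a set of measure zero, and the two densities produce the same CDF values:

    import numpy as np
    from scipy.integrate import quad

    SQRT_PI = np.sqrt(np.pi)

    def f(x):
        # density 1: equals 1 on the odd integers, exp(-x^2)/sqrt(pi) elsewhere
        if float(x).is_integer() and int(x) % 2 == 1:
            return 1.0
        return np.exp(-x**2) / SQRT_PI

    def f2(x):
        # density 2: equals 1 on the even integers, exp(-x^2)/sqrt(pi) elsewhere
        if float(x).is_integer() and int(x) % 2 == 0:
            return 1.0
        return np.exp(-x**2) / SQRT_PI

    for b in (-1.0, 0.0, 0.5, 2.0):
        F1, _ = quad(f, -np.inf, b)   # F(b) = integral of f over (-inf, b]
        F2, _ = quad(f2, -np.inf, b)
        print(f"b={b}: {F1:.10f} vs {F2:.10f}")  # agree to quadrature precision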

DWin

I disagree with the statement, "the probability density function does not uniquely determine a probability measure", implicit in the opening question. It does uniquely determine it, provided we identify functions that agree almost everywhere.

Let $f_1,f_2:\mathbb{R}\to [0,\infty)$ be two probability density functions. If $$ \int_E f_1 = \int_E f_2 $$ for every measurable set $E$, then $f_1=f_2$ almost everywhere. This determines the pdf uniquely (because in analysis we do not care if two functions disagree on a set of measure zero).

We can rewrite the above as $$ \int_E g = 0, $$ where $g=f_1-f_2$ is an integrable function.

Define $E = \{ x \in \mathbb{R} ~ | ~ g(x) \geq 0 \}$, so that $\int_E g = 0$ with $g \geq 0$ on $E$. We use the well-known theorem that if the integral of a non-negative function is zero, then the function is zero almost everywhere. In particular, $g=0$ a.e. on $E$, so $f_1 = f_2$ a.e. on $E$. Now repeat the argument in the other direction with $F = \{ x\in \mathbb{R} ~ | ~ g(x) \leq 0 \}$, applying the theorem to $-g$. We get that $f_1 = f_2$ a.e. on $F$. Thus, $f_1 = f_2$ a.e. on $E\cup F = \mathbb{R}$.
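For completeness, one standard proof of the "well-known theorem" invoked above (a sketch added here, not part of the original answer): if $h\ge 0$ is measurable and $\int h\,d\lambda = 0$, then Markov's inequality gives $\lambda(\{h\ge 1/n\})\le n\int h\,d\lambda = 0$ for every $n\ge 1$; since $\{h>0\}=\bigcup_{n\ge 1}\{h\ge 1/n\}$ is a countable union of null sets, $\lambda(\{h>0\})=0$, i.e. $h=0$ almost everywhere.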

Nicolas Bourbaki