This is probably a silly question, but is probability theory the study of functions that integrate/sum to one?
EDIT. I forgot non-negativity. So is probability theory the study of non-negative functions that integrate/sum to one?
At a purely formal level, one could call probability theory the study of measure spaces with total measure one, but that would be like calling number theory the study of strings of digits which terminate.
-- from Terry Tao's *Topics in Random Matrix Theory*.
I think this is the really fundamental thing. If we've got a probability space $(\Omega, \mathscr F, P)$ and a random variable $X : \Omega \to \mathbb R$ with pushforward measure $P_X := P \circ X^{-1}$, then the reason a density $f = \frac{\text d P_X}{\text d\mu}$ (with respect to a dominating measure $\mu$) integrates to one is that $P(\Omega) = 1$. And that's more fundamental than pdfs vs pmfs.
Here's the proof: $$ \int_{\mathbb R} f \,\text d\mu = \int_{\mathbb R} \,\text dP_X = P_X(\mathbb R) = P\left(\{\omega \in \Omega : X(\omega) \in \mathbb R\}\right) = P(\Omega) = 1. $$
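As a purely numerical illustration of that chain of equalities (not part of the proof), here is a sketch with a concrete choice that is mine, not the argument's: take $\Omega = \mathbb R$ with $P$ the standard normal measure and $X(\omega) = e^\omega$, so that $P_X$ is the standard lognormal distribution.

```python
# Sanity check: the density of the pushforward P_X integrates to one,
# and P_X(B) = P({omega : X(omega) in B}) for a sample Borel set B.
import numpy as np
from scipy import stats, integrate

rng = np.random.default_rng(0)
omega = rng.standard_normal(1_000_000)   # draws from (Omega, P)
x = np.exp(omega)                        # pushforward samples from P_X

f = stats.lognorm(s=1).pdf               # density f = dP_X/dmu, mu = Lebesgue

total, _ = integrate.quad(f, 0, np.inf)
print(total)                             # ~1.0: f integrates to one

a, b = 0.5, 2.0                          # B = (0.5, 2]
print(integrate.quad(f, a, b)[0],        # P_X(B) via the density...
      np.mean((x > a) & (x <= b)))       # ...matches P(X in B) empirically
```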
This is almost a rephrasing of AdamO's answer (+1) because all CDFs are càdlàg, and there's a one-to-one relationship between the set of CDFs on $\mathbb R$ and the set of all probability measures on $(\mathbb R, \mathbb B)$, but since the CDF of a RV is defined in terms of its distribution, I view probability spaces as the place to "start" with this kind of endeavor.
I'm updating to elaborate on the correspondence between CDFs and probability measures and how both are reasonable answers for this question.
We begin with two probability measures and analyze the corresponding CDFs; we conclude by instead starting with a CDF and looking at the measure it induces.
Let $Q$ and $R$ be probability measures on $(\mathbb R, \mathbb B)$ and let $F_Q$ and $F_R$ be their respective CDFs (i.e. $F_Q(a) = Q\left((-\infty, a]\right)$ and similarly for $R$). $Q$ and $R$ both would represent pushforward measures of random variables (i.e. distributions) but it doesn't actually matter where they came from for this.
The key idea is this: if $Q$ and $R$ agree on a rich enough collection of sets, then they agree on the $\sigma$-algebra generated by those sets. Intuitively, if we've got a well-behaved collection of events that, through countably many complements, intersections, and unions, generates all of $\mathbb B$, then agreeing on all of those sets leaves no wiggle room for disagreeing on any Borel set.
Let's formalize that. Let $\mathscr S = \{(-\infty, a] : a \in \mathbb R\}$ and let $\mathcal L = \{A \subseteq \mathbb R : Q(A) = R(A)\}$, i.e. $\mathcal L$ is the subset of $\mathcal P(\mathbb R)$ on which $Q$ and $R$ agree (and are defined). Note that we're allowing for them to agree on non-Borel sets since $\mathcal L$ as defined isn't necessarily a subset of $\mathbb B$. Our goal is to show that $\mathbb B \subseteq \mathcal L$.
It turns out that $\sigma(\mathscr S)$ (the $\sigma$-algebra generated by $\mathscr S$) is in fact $\mathbb B$, so we hope that $\mathscr S$ is a sufficiently big collection of events that if $Q = R$ everywhere on $\mathscr S$ then they're forced to be equal on all of $\mathbb B$.
Note that $\mathscr S$ is closed under finite intersections, and that $\mathcal L$ contains $\mathbb R$ and is closed under complements and countable disjoint unions (this follows from $\sigma$-additivity). This means that $\mathscr S$ is a $\pi$-system and $\mathcal L$ is a $\lambda$-system. If $F_Q = F_R$ then $Q$ and $R$ agree on every set $(-\infty, a]$, i.e. $\mathscr S \subseteq \mathcal L$, so by the $\pi$-$\lambda$ theorem we therefore have that $\sigma(\mathscr S) = \mathbb B \subseteq \mathcal L$. The elements of $\mathscr S$ are nowhere near as complex as an arbitrary Borel set, but because any Borel set can be formed from countably many complements, unions, and intersections of elements of $\mathscr S$, if there is not a single disagreement between $Q$ and $R$ on elements of $\mathscr S$ then this carries through to there being no disagreement on any $B \in \mathbb B$.
We have just shown that if $F_Q = F_R$ then $Q = R$ (on $\mathbb B$), which means that the map $Q \mapsto F_Q$ from $\mathscr P := \{P : P \text { is a probability measure on } (\mathbb R, \mathbb B)\}$ to $\mathcal F := \{F : \mathbb R \to \mathbb R : F \text { is a CDF}\}$ is an injection.
Now if we want to think about going the other direction, we want to start with a CDF $F$ and show that there is a unique probability measure $Q$ such that $F(a) = Q\left((-\infty, a]\right)$. This will establish that our mapping $Q \mapsto F_Q$ is in fact a bijection. For this direction, we define $F$ without any reference to probability or measures.
We first define a Stieltjes measure function as a function $G : \mathbb R \to \mathbb R$ such that

1. $G$ is non-decreasing, and
2. $G$ is right-continuous

(and note how being càdlàg follows from this definition, since a monotone function has left limits everywhere, but because of the extra non-decreasing constraint "most" càdlàg functions are not Stieltjes measure functions).
It can be shown that each Stieltjes measure function $G$ induces a unique measure $\mu$ on $(\mathbb R, \mathbb B)$ defined by $$ \mu\left((a, b]\right) = G(b) - G(a) $$ (see e.g. Durrett's *Probability: Theory and Examples* for details on this). For example, the Lebesgue measure is induced by $G(x) = x$.
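Here is a minimal sketch of that induced measure on half-open intervals (the function names are mine, and only intervals are handled, not the full extension to all of $\mathbb B$):

```python
# The measure a Stieltjes measure function G assigns to half-open
# intervals (a, b], following the display above.
def stieltjes_measure(G):
    """Return mu with mu((a, b]) = G(b) - G(a)."""
    return lambda a, b: G(b) - G(a)

# G(x) = x induces Lebesgue measure: mu((2, 5]) is the length of (2, 5].
length = stieltjes_measure(lambda x: x)
print(length(2.0, 5.0))   # 3.0

# A step function induces a point mass: the indicator of [0, inf) is
# non-decreasing and right-continuous, and puts mass 1 at the jump.
delta0 = stieltjes_measure(lambda x: 1.0 if x >= 0 else 0.0)
print(delta0(-1.0, 1.0))  # 1.0
```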
Now noting that a CDF is a Stieltjes measure function $F$ with the additional properties that $\lim_{x\to-\infty} F(x) := F(-\infty) = 0$ and $\lim_{x\to\infty} F(x) := F(\infty) = 1$, we can apply that result to show that for every CDF $F$ we get a unique measure $Q$ on $(\mathbb R, \mathbb B)$ defined by $$ Q\left((a, b]\right) = F(b) - F(a). $$
Note how $Q\left((-\infty, a]\right) = F(a) - F(-\infty) = F(a)$ and $Q\left((-\infty, \infty)\right) = F(\infty) - F(-\infty) = 1$, so $Q$ is a probability measure and is exactly the one we would have used to define $F$ if we were going the other direction.
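As a quick illustration of this direction, here is a sketch with an assumed concrete choice of $F$ (the standard normal CDF via SciPy; the helper name `Q` is mine):

```python
# Given a CDF F, Q((a, b]) = F(b) - F(a) defines a probability measure.
import numpy as np
from scipy import stats

F = stats.norm.cdf

def Q(a, b):
    """Q((a, b]) = F(b) - F(a); a = -inf and b = +inf are allowed."""
    return F(b) - F(a)

print(Q(-np.inf, 1.0))     # equals F(1.0), as in the identity above
print(Q(-np.inf, np.inf))  # 1.0, so Q is a probability measure
print(Q(0.0, 1.0))         # mass of (0, 1] under the standard normal
```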
Altogether we have now seen that the mapping $Q \mapsto F_Q$ is 1-1 and onto, so we really do have a bijection between $\mathscr P$ and $\mathcal F$. Bringing this back to the actual question, this shows that we could equivalently hold up either CDFs or probability measures as the object we declare probability to be the study of (while also recognizing that this is a somewhat facetious endeavor). I personally still prefer probability spaces because I feel like the theory flows more naturally in that direction, but CDFs are not "wrong".
No; the Cantor distribution is just such a counterexample. A random variable with the Cantor distribution has no density (and no mass function either), but it does have a distribution function. I would say, therefore, that probability theory is the study of non-decreasing càdlàg functions, inclusive of the Cantor DF, that have limit 0 at $-\infty$ and limit 1 at $+\infty$.
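To make the counterexample concrete, here is a hedged sketch (my own construction, computed from the base-3 expansion of $x$) of the Cantor distribution function, the "devil's staircase":

```python
# Approximate the Cantor DF on [0, 1] using `depth` base-3 digits of x.
def cantor_cdf(x, depth=40):
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    value, scale = 0.0, 0.5
    for _ in range(depth):
        x *= 3.0
        digit = int(x)                 # next base-3 digit of x
        x -= digit
        if digit == 1:                 # x is in a removed middle third:
            return value + scale       # the CDF is flat there
        value += scale * (digit // 2)  # base-3 digit 2 -> binary digit 1
        scale *= 0.5
    return value

# Continuous and non-decreasing, yet its derivative is 0 almost everywhere,
# so it has no pdf; and it has no jumps, so it has no pmf either.
print(cantor_cdf(1/3), cantor_cdf(2/3), cantor_cdf(0.5))  # 0.5, 0.5, 0.5
```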
I'm sure you'll get good answers, but I'll give you a slightly different perspective here.
You may have heard mathematicians saying that physics is pretty much mathematics, or just an application of mathematics to the most basic laws of nature. Some mathematicians (many?) actually do believe that this is the case. I've heard it over and over at university. In this regard you're asking a similar question, though not as sweeping as that one.
Physicists usually don't bother even responding to this statement: it's too obvious to them that it's not true. However, if you try to respond, it becomes clear that the answer is not so trivial if you want to make it convincing.
My answer is that physics is not just a bunch of models and equations and theories. It's a field with its own set of approaches, tools, heuristics, and ways of thinking. That's one reason why, although Poincaré developed relativity theory before Einstein, he didn't realize all the implications and didn't push to get everyone on board. Einstein did, because he was a physicist and he got what it meant immediately. I'm not a fan of the guy, but his work on Brownian motion is another example of how a physicist builds a mathematical model. That paper is amazing, and is filled with intuition and traces of thinking that are unmistakably physics-y.
So, my answer to you is that even if it were the case that probability deals with the kind of functions you described, it would still not be the study of those functions. Nor is it measure theory applied to some subclass of measures. Probability theory is a distinct field that studies probabilities; it's linked to the natural world through radioactive decay, quantum mechanics, gases, etc. If it happens that certain functions seem to be suitable for modeling probabilities, then we'll use them and study their properties too, but while doing so we'll keep an eye on the main prize: the probabilities.
Well, it's only partially true: it lacks a second condition, since negative probabilities do not make sense. Hence, these functions have to satisfy two conditions:
Continuous distributions: $$ \int_{\mathcal{D}}f(x)\, dx = 1 \quad \text{and} \quad f(x) \geq 0 \; \forall x \in \mathcal{D}$$
Discrete distributions: $$ \sum_{x \in \mathcal{D}}P(x) = 1 \quad \text{and} \quad 0 \leq P(x) \leq 1 \; \forall x \in \mathcal{D}$$
where $\mathcal{D}$ is the domain on which the distribution is defined.
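A quick numerical check of both conditions, with assumed example distributions (a standard normal for the continuous case, a Poisson(3) for the discrete one):

```python
import numpy as np
from scipy import stats, integrate

# Continuous: the standard normal pdf is non-negative and integrates to 1.
area, _ = integrate.quad(stats.norm.pdf, -np.inf, np.inf)
print(area)  # ~1.0

# Discrete: a Poisson(3) pmf lies in [0, 1] and sums to 1
# (the truncated tail beyond k = 99 is negligible).
k = np.arange(0, 100)
print(stats.poisson(mu=3).pmf(k).sum())  # ~1.0
```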
I would say no, that's not what probability theory fundamentally is, but I would say it for different reasons than the other answers.
Fundamentally, I would say, probability theory is the study of two things:
Stochastic processes, and
Bayesian inference.
Stochastic processes includes things like rolling dice, drawing balls from urns, etc., as well as the more sophisticated models found in physics and mathematics. Bayesian inference is reasoning under uncertainty, using probabilities to represent the value of unknown quantities.
These two things are more closely related than they might at first appear. One reason we can study them under the same umbrella is that important aspects of both of them can be represented as non-negative functions that sum/integrate to one. But probability isn't just the study of those functions - their interpretation in terms of random processes and inference is also an important part of it.
For example, probability theory includes concepts such as conditional probabilities and random variables, and quantities such as the entropy, the mutual information, and the expectation and variance of random variables. While one could define these things purely in terms of normalised non-negative functions, the motivation for this would seem pretty weird without the interpretation in terms of random processes and inference.
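For instance, here is a small sketch (assuming a fair six-sided die as the example pmf, a choice of mine) of how such quantities sit on top of a normalised non-negative function:

```python
import numpy as np

x = np.arange(1, 7)
p = np.full(6, 1/6)                 # the pmf: non-negative, sums to one

entropy = -np.sum(p * np.log2(p))   # Shannon entropy in bits
mean = np.sum(x * p)                # expectation of the random variable
var = np.sum((x - mean)**2 * p)     # variance

print(entropy, mean, var)           # ~2.585, 3.5, ~2.917
```

Formally these are just functionals of a normalised non-negative function, but their names and uses only make sense under the random-process and inference interpretations.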
Moreover, one sometimes comes across concepts in probability theory, particularly on the inference side, which cannot be expressed in terms of a non-negative function that normalises to one. The so-called "improper priors" come to mind here, and AdamO gave the Cantor distribution as another example.
There certainly are some areas of probability theory in which the main interest is in the mathematical properties of normalised non-negative functions, for which the two application domains I mentioned are not important. When this is the case, we often call it measure theory rather than probability theory. But probability theory is also - indeed, I would say mostly - an applied field, and the applications of probability distributions are in themselves a non-trivial component of the field.