40

Often, in the course of my (self-)study of statistics, I've encountered the term "$\sigma$-algebra generated by a random variable". I don't understand the definition on Wikipedia, but most importantly I don't get the intuition behind it. Why/when do we need $\sigma$-algebras generated by random variables? What is their meaning? I know the following:

  • a $\sigma$-algebra on a set $\Omega$ is a nonempty collection of subsets of $\Omega$ which contains $\Omega$ and is closed under complementation and countable unions.
  • we introduce $\sigma$-algebras to build probability spaces on infinite sample spaces. In particular, if $\Omega$ is uncountably infinite, there can exist non-measurable subsets (sets to which we cannot consistently assign a probability). Thus, we can't just use the power set of $\Omega$, $\mathcal{P}(\Omega)$, as our set of events $\mathcal{F}$. We need a smaller collection, which is still large enough that we can define the probability of interesting events and talk about the convergence of a sequence of random variables.

In short, I think I have a fair intuitive understanding of $\sigma$-algebras. I would like to have a similar understanding of the $\sigma$-algebra generated by a random variable: its definition, why we need it, the intuition behind it, and an example.

    One effective (and intuitively meaningful) characterization is that this is the coarsest sigma-algebra on $\Omega$ that makes the random variable measurable. – whuber Nov 07 '17 at 15:50
  • @whuber coarsest means smallest? In other words, I have my probability space $(\Omega,\mathcal{F},P)$, I have an RV $X:\Omega\to\mathbb{R}$ (which is measurable by definition of random variable), and $\sigma$ is the smallest subset of $\mathcal{F}$ such that $X$ is still measurable. Ok, but this raises the question of what it means intuitively that $X$ is measurable :-) would it make sense to say that we can define the probability of all events of the kind $a\lt X \lt b$ and unions/intersections? – DeltaIV Nov 07 '17 at 16:23
  • Looking at a single $X$ at a time affords little intuition concerning measurability. This concept comes into its own when you study collections of random variables--stochastic processes. In turn, the simplest stochastic processes (such as finite discrete Binomial random walks) provide an interpretable setting in which the sigma-algebra generated by all variables $X_0, X_1, \ldots, X_t$ can be thought of as "the information available up to (and including) time $t$." – whuber Nov 07 '17 at 16:26
  • @whuber sorry, I don't understand :) I'd appreciate if you could point me to another answer of yours where you go more in detail, or if you would like to expand this as answer. Otherwise don't worry - maybe I don't know enough about stochastic processes to get your point. Altough..I need to hone my Dynamic Bayesian Network skills, so if this intuition helps when working on time series, I'd be quite interested. – DeltaIV Nov 07 '17 at 16:32
  • See https://stats.stackexchange.com/a/123754/919. Also helpful might be https://stats.stackexchange.com/a/164995/919 and https://stats.stackexchange.com/a/74339/919. – whuber Nov 07 '17 at 18:28

2 Answers

38

Consider a random variable $X$. We know that $X$ is nothing but a measurable function from $\left(\Omega, \mathcal{A} \right)$ into $\left(\mathbb{R}, \mathcal{B}(\mathbb{R}) \right)$, where $\mathcal{B}(\mathbb{R})$ are the Borel sets of the real line. By the definition of measurability, we have

$$X^{-1} \left(B \right) \in \mathcal{A}, \quad \forall B \in \mathcal{B}\left(\mathbb{R}\right)$$

But in practice the preimages of the Borel sets need not exhaust $\mathcal{A}$; they may instead constitute a much coarser sub-collection of it. To see this, let us define

$$\mathcal{\Sigma} = \left\{ S \in \mathcal{A}: S = X^{-1}(B), \ B \in \mathcal{B}(\mathbb{R}) \right\}$$

Using the properties of preimages, it is not too difficult to show that $\mathcal{\Sigma}$ is a sigma-algebra. It also follows immediately that $\mathcal{\Sigma} \subset \mathcal{A}$, hence $\mathcal{\Sigma}$ is a sub-sigma-algebra. Further, it is easy to see from the definitions that the mapping $X: \left( \Omega, \mathcal{\Sigma} \right) \to \left( \mathbb{R}, \mathcal{B} \left(\mathbb{R} \right) \right)$ is measurable. In fact, $\mathcal{\Sigma}$ is the smallest sigma-algebra that makes $X$ measurable (i.e., a random variable), as any other sigma-algebra of that kind must at the very least include $\mathcal{\Sigma}$. Because we are dealing with preimages of the random variable $X$, we call $\mathcal{\Sigma}$ the sigma-algebra induced (or generated) by the random variable $X$, often written $\sigma(X)$.

Here is an extreme example: consider a constant random variable $X$, that is, $X(\omega) \equiv \alpha$. Then $X^{-1} \left(B \right)$, for $B \in \mathcal{B} \left(\mathbb{R} \right)$, equals either $\Omega$ or $\varnothing$, depending on whether or not $\alpha \in B$. The sigma-algebra thus generated is the trivial one, $\{\varnothing, \Omega\}$, and as such it is certainly included in $\mathcal{A}$.
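To make the construction concrete, here is a small, purely illustrative Python sketch (my own addition, not part of the original answer). On a finite sample space we can sidestep Borel sets and simply take preimages of every subset of the finite range of $X$; the die-throw sample space, the parity indicator, and the helper `generated_sigma_algebra` are assumptions made for illustration, not anything from a library.

```python
from itertools import combinations

def generated_sigma_algebra(omega, X):
    """Illustrative helper: sigma-algebra generated by X on a finite sample space.

    On a finite space it is enough to take the preimages X^{-1}(B) of every
    subset B of the (finite) range of X; these preimages form the collection
    called Sigma in the answer above.
    """
    values = list({X(w) for w in omega})
    sigma = set()
    for r in range(len(values) + 1):
        for B in combinations(values, r):                        # every subset B of the range
            sigma.add(frozenset(w for w in omega if X(w) in B))  # the preimage X^{-1}(B)
    return sigma

omega = {1, 2, 3, 4, 5, 6}                    # a single die throw

# X can only distinguish "even" from "odd" ...
X = lambda w: 1 if w % 2 == 0 else 0
sigma_X = generated_sigma_algebra(omega, X)
print(len(sigma_X))                           # 4 events: {}, odds, evens, Omega
                                              # versus 2**6 = 64 sets in the power set

# ... and the extreme case from the answer: a constant random variable.
C = lambda w: 42.0
print(generated_sigma_algebra(omega, C))      # only the empty set and Omega

# Sanity check: sigma_X is closed under complementation and (finite) unions.
assert all(frozenset(omega) - A in sigma_X for A in sigma_X)
assert all(A | B in sigma_X for A in sigma_X for B in sigma_X)
```

The indicator can only "see" whether the outcome is even or odd, so $\sigma(X)$ contains just 4 events out of the $2^6 = 64$ subsets of $\Omega$; the constant variable sees nothing at all, giving the trivial sigma-algebra $\{\varnothing, \Omega\}$ of the extreme example above.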

Hope this helps.

JohnK
  • $\mathcal{A}$ is the set of events, right? The one I denoted with $\mathcal{F}$ – DeltaIV Nov 07 '17 at 16:24
  • Yes, I was born with the condition of finding $\mathcal{A}$ more appealing than $\mathcal{F}$. – JohnK Nov 07 '17 at 16:26
  • excellent! Very clear. You should write a book :) – DeltaIV Nov 07 '17 at 16:26
  • Hi @JohnK, I have a question if you don't mind: in your statement "But in practice the preimages of the Borel sets may not be all of $\mathcal{A}$," how can we prove this? Your follow-up statement "It also follows immediately that [...] $\Sigma$ is a sub-sigma-algebra" wasn't enough of a clarification for me. I apologize and thanks in advance. – Goldman Clarck May 06 '20 at 08:51
  • @GoldmanClarck - "How can we prove this?" You can't. Paraphrasing, John was saying 'for most random variables that we come across in the wild, the preimages of the Borel sets are probably not all of $\mathcal A$'. This statement is too imprecise to prove. John does give one extreme example as evidence (the constant r.v.). – James Dec 15 '20 at 06:35
  • Elements of a set can themselves be sets; think of the elements of the power set of some set as an example. – confused student Aug 13 '21 at 10:14
  • @JohnK Hi, regarding the last line of your explanation: you said that if a random variable is a constant, $X(\omega)=c$, the sigma-algebra generated by this random variable can only be $\{\Omega, \varnothing\}$, so random variables taking different constant values will have the same preimages, either $\Omega$ or $\varnothing$... is that what you mean? Also it reminds me of the indicator function: if $\omega \in A$, $X(\omega)=1$, else 0; the sigma-algebra generated by this random variable will be $\mathcal{F}=\{\Omega, A, A^c, \varnothing\}$. – JoZ Oct 25 '21 at 13:42
  • @JohnK does this imply that the __form__ of the sigma-algebra generated by the random variable depends entirely on the __form__ of the random variable? (not sure I am using the correct word). What if the random variable is a line? What about a curve? What about the bell-shaped curve? Will there be a corresponding __form__ of the sigma-algebra, like the ones for the constant and the indicator function? – JoZ Oct 25 '21 at 13:47
2

I will attempt to illustrate the intuition from a different, less technically detailed perspective.

Assume four random variables $X_1, X_2, X_3$ and $Y=f(X_1,X_2)$, for an arbitrary (measurable) function $f$. Notice that $Y$ is random, but it is completely determined once $X_1, X_2$ are fixed, whereas $X_3$ is not determined by fixing $X_1, X_2$. In other words,

the randomness in $Y$ is exclusively due to $X_1$ and $X_2$.

Can we express that formally without referencing the function $f$?

This is precisely what the notion of the $\sigma$-algebra generated by a random variable captures. Informally, we could say that $\sigma(X)$ restricts the world's randomness to just $X$, disabling any other source of randomness. In the example above, $\sigma((X_1,X_2))$ contains $\sigma(Y)$ (equivalently, $Y$ is $\sigma((X_1,X_2))$-measurable), because the randomness of $(X_1,X_2)$ contains the randomness of $Y$. The reverse containment would hold only if $f$ were a one-to-one mapping.
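As a purely illustrative sketch of this containment (my own addition; the finite sample space, the coin-flip encoding, and the helper `generated_sigma_algebra` are assumptions made for illustration, not anything from a library), take $\Omega$ to be the four outcomes of two coin flips, $X_1, X_2$ the individual flips, and $Y = X_1 + X_2$ as a concrete, non-injective $f(X_1,X_2)$:

```python
from itertools import combinations

def generated_sigma_algebra(omega, Z):
    """Illustrative helper: sigma-algebra generated by a map Z on a finite
    sample space omega, i.e. the preimages Z^{-1}(B) of all subsets B of
    Z's (finite) range."""
    values = list({Z(w) for w in omega})
    sigma = set()
    for r in range(len(values) + 1):
        for B in combinations(values, r):
            sigma.add(frozenset(w for w in omega if Z(w) in B))
    return sigma

# Omega: outcomes of two coin flips, encoded as pairs (x1, x2).
omega = {(x1, x2) for x1 in (0, 1) for x2 in (0, 1)}

X1 = lambda w: w[0]
X2 = lambda w: w[1]
pair = lambda w: (X1(w), X2(w))     # the random vector (X1, X2)
f = lambda a, b: a + b              # an arbitrary, non-injective f
Y = lambda w: f(X1(w), X2(w))       # Y = f(X1, X2)

sigma_pair = generated_sigma_algebra(omega, pair)
sigma_Y = generated_sigma_algebra(omega, Y)

# The randomness of Y comes only from (X1, X2): every event describable
# through Y is already describable through (X1, X2) ...
print(sigma_Y <= sigma_pair)        # True: sigma(Y) is contained in sigma((X1, X2))

# ... but not conversely, because f is not one-to-one: for instance the event
# {(0, 1)} lies in sigma((X1, X2)), yet Y cannot separate (0, 1) from (1, 0).
print(sigma_pair <= sigma_Y)        # False
```

Enumerating both collections confirms $\sigma(Y) \subseteq \sigma((X_1,X_2))$, while the reverse containment fails precisely because $f(x_1,x_2)=x_1+x_2$ identifies $(0,1)$ and $(1,0)$.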