
Example: Toss a coin twice. Letting $\mathbb P$ be a probability measure, suppose $\mathbb P(HH)=p^2,\mathbb P(HT)=\mathbb P(TH)=p(1-p), \mathbb P(TT)=(1-p)^2.$ I would like to answer the following questions:

  1. Define $Y$ to be the number of heads in the example. Derive the $\sigma$-field generated by $Y$.

  2. Find the expectation of the random variable $Y$ from the previous exercise, and also the conditional expectation of $Y$ given $\mathcal F_{1}$, where $\mathcal F_{1}=\{\varnothing,\{HH,HT\},\{TH,TT\},\Omega\}$. Check that in this case $E[E[Y|\mathcal F_{1}]]=E[Y]$.

My answers: the sample space is $\Omega=\{HH,HT,TH,TT\}$. For question 1, I think the partition induced by $Y$ is $\{\{TT\},\{HT,TH\},\{HH\}\}$, so the $\sigma$-field generated by $Y$ is $\mathcal F(Y)=\{\varnothing,\{TT\},\{HH,HT,TH\},\{HT,TH\},\{HH,TT\},\{HH\},\{HT,TH,TT\},\Omega\}.$
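
As a brute-force check (a minimal Python sketch; the outcomes are just labelled strings), the generated $\sigma$-field should come out as all unions of the level sets of $Y$:

```python
from itertools import combinations

# Level sets (preimages) of Y = number of heads in two tosses
partition = [frozenset({"TT"}), frozenset({"HT", "TH"}), frozenset({"HH"})]

# The sigma-field generated by Y consists of all unions of these blocks
sigma_Y = set()
for r in range(len(partition) + 1):
    for blocks in combinations(partition, r):
        sigma_Y.add(frozenset().union(*blocks))

for event in sorted(sigma_Y, key=len):
    print(sorted(event) if event else "empty set")
# 8 events: empty set, {TT}, {HH}, {HT,TH}, {HH,TT}, {HT,TH,TT}, {HH,HT,TH}, Omega
```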

But for question 2 I could only get $E(Y)=2p$. How is the conditional expectation computed?

pual ambagher

2 Answers


The formal definition of conditional expectation is that $E[Y|\mathcal{F}_1]$ is any random variable measurable with respect to $\mathcal{F}_1$ having the property that

$$\int_F E[Y|\mathcal{F}_1](\omega)d\mathbb{P}(\omega) = \int_F Y(\omega) d\mathbb{P}(\omega)$$

for all $\mathcal{F}_1$-measurable sets $F$.

In the present case, this definition invites us to inspect all the measurable subsets $F$ with respect to $\mathcal{F}_1$, which you already computed in the first problem. The trick is to begin with the smallest, most basic $\mathcal{F}_1$-measurable sets (apart from the empty set), which are $\{HH, HT\}$ and $\{TH, TT\}$. Although we don't yet know $E[Y|\mathcal{F}_1]$, we can use the right hand side to compute its integrals. Because neither of these events can be decomposed (nontrivially) into smaller ones, the conditional expectation must have a constant value on each one. For example, writing $$E[Y|\mathcal{F}_1](HH) =E[Y|\mathcal{F}_1](HT) = z,$$

the definition gives

$$\eqalign{ zp &= zp^2 + zp(1-p) \\ &=E[Y|\mathcal{F}_1](HH) \mathbb{P}(HH) +E[Y|\mathcal{F}_1](HT) \mathbb{P}(HT)\\ &=\int_{\{HH, HT\}} E[Y|\mathcal{F}_1](\omega)d\mathbb{P}(\omega)\\ &= \int_{\{HH, HT\}} Y(\omega) d\mathbb{P}(\omega)\\ &= Y(HH)\mathbb{P}(HH) + Y(HT)\mathbb{P}(HT) \\ &= 2p^2 + 1p(1-p)= p+p^2, }$$

whence we deduce

$$z = \frac{p+p^2}{p} = 1 + p.$$

A similar calculation for $F = \{TH, TT\}$ (do it!) establishes that

$$E[Y|\mathcal{F}_1](TH) =E[Y|\mathcal{F}_1](TT) = p.$$
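
If you want to confirm these values symbolically, here is a small sketch (using SymPy; the symbols `z` and `w` stand for the unknown constant values on the two atoms) that solves the defining equations directly:

```python
import sympy as sp

p, z, w = sp.symbols("p z w", positive=True)

# Probabilities of the four outcomes and the values of Y
P = {"HH": p**2, "HT": p*(1 - p), "TH": p*(1 - p), "TT": (1 - p)**2}
Y = {"HH": 2, "HT": 1, "TH": 1, "TT": 0}

# Defining property on the atom {HH, HT}: integral of E[Y|F1] equals integral of Y
eq_heads = sp.Eq(z*(P["HH"] + P["HT"]), Y["HH"]*P["HH"] + Y["HT"]*P["HT"])
# Same property on the atom {TH, TT}
eq_tails = sp.Eq(w*(P["TH"] + P["TT"]), Y["TH"]*P["TH"] + Y["TT"]*P["TT"])

print(sp.simplify(sp.solve(eq_heads, z)[0]))  # p + 1
print(sp.simplify(sp.solve(eq_tails, w)[0]))  # p
```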

There is a simple intuition to support these abstract calculations: $\mathcal{F}_1$ records the information available after flipping the coin the first time. If it comes up heads, the possible events (which have only partially occurred!) are $HH$ and $HT$. We already have one head and there is a chance $p$ that the second flip will be a head. Thus, at this stage, our expectation of $Y$ equals $1$ (for what has happened) plus $1\times p$ (for what could happen yet), summing to $1+p$. If instead the first flip is tails, we have seen no heads yet but there is still a chance of $p$ of seeing a head on the second flip: the expectation of $Y$ is just $0 + 1\times p = p$ in that case.

As a check, we can compute

$$\eqalign{ E[E[Y|\mathcal{F}_1]] &= \int_\Omega E[Y|\mathcal{F}_1](\omega)d\mathbb{P}(\omega) \\ &= (1+p)\mathbb{P}(E[Y|\mathcal{F}_1]=1+p) + (p)\mathbb{P}(E[Y|\mathcal{F}_1]=p)\\ & = (1+p)\mathbb{P}(\{HH, HT\}) + (p) \mathbb{P}(\{TH, TT\})\\ &= (1+p)(p^2 + p(1-p)) + p((1-p)p + (1-p)^2)\\ &= 2p, }$$

exactly as before.
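
A simulation makes the same point: average $Y$ within each first-toss group, then average those group means (a rough Monte Carlo sketch; the value `p = 0.3` is an arbitrary choice, not part of the problem):

```python
import numpy as np

rng = np.random.default_rng(0)
p, n = 0.3, 1_000_000

first = rng.random(n) < p     # first toss: True means heads
second = rng.random(n) < p    # second toss: True means heads
Y = first.astype(int) + second.astype(int)

# E[Y | F1] is constant on {first toss = H} and on {first toss = T}
mean_given_H = Y[first].mean()     # close to 1 + p
mean_given_T = Y[~first].mean()    # close to p

# Tower property: averaging the conditional means recovers E[Y]
tower = mean_given_H * first.mean() + mean_given_T * (~first).mean()
print(mean_given_H, mean_given_T)  # about 1.3 and 0.3
print(tower, Y.mean())             # both about 2p = 0.6
```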

It should be clear that this is just a laborious way of expressing the idea that there is a chance $p$ at the outset of heads (which has a conditional expectation of $1+p$) and a chance $1-p$ of tails (which has a conditional expectation of $p$): everything reduces to the elementary calculation of conditional expectations, which needs no sigma fields or integrals. The point of this exercise is to build on that intuition to develop an understanding that will hold up when these sigma algebras get much, much more complicated.

whuber
  • (+1) You've beaten me on this! – Zen Nov 01 '13 at 21:45
  • @Zen Sorry about the duplication (and +1 to yours since it's almost identical :-). I try to avoid answering routine questions, because so many people here are so helpful that there's no need to jump in. However, an answer by someone faster than either of us, since deleted, had confused the issues so much I felt obliged to clear the air a little. – whuber Nov 01 '13 at 22:43
  • 2
    I also wrote my answer because that deleted answer bothered me a little bit. A sub-sigma-field represents partial information, conditional expectations are measurable functions. We all should write this on the mirror. I like your intuitive explanation at the end – Zen Nov 02 '13 at 00:49
  • 1
    @whuber, there is one small part of this that I don't understand: *the conditional expectation must have a constant value on each one* - what does that mean? – makansij Oct 27 '18 at 18:38
  • 1
    @Hunle "One" refers back to "events." An event is a set; the conditional expectation is a function; *ergo*, it must have the same value on each element of the event. – whuber Oct 27 '18 at 18:55
  • Okay, thank you @whuber. I guess my confusion is around the following: I understood that random variables are functions that map outcomes to real numbers. Here, you seem to be saying that the conditional expectation maps events to real numbers? And thus any outcomes within the same event must map to the same real number? Whereas it seems like your answer uses the sigma algebra $Y=\{\{HT\}\}$, the $\sigma$-field generated by $Y$ being the unions and complements of its elements, together with the empty set and $\Omega$: $\mathcal F(Y)=\{\varnothing, \{HT,HH\},\{HH\},\{HT\},\Omega\}.$ – makansij Oct 28 '18 at 16:26
  • @Hunle I'm trying hard not to say that, as the first line of my answer emphasizes. The conditional expectation is a random variable; and that means it is a function defined on the sample space. – whuber Oct 28 '18 at 17:43
  • Is it always the case that the $E[Y\mid \mathcal{F}_1](\omega)$ are equal among the $\omega \in A$ for a set $A \in \mathcal{F}_1$? Is this just by definition? – itzjustricky Dec 26 '18 at 21:30
  • @itzjustricky Not quite. After all, $\Omega\in\mathcal{F}_1$ but we cannot insist that the conditional expectations be constant on $\Omega$! What is important, according to the definition of measurability, is that the inverse image of any Borel set in $\mathbb{R}$ be an element of $\mathcal{F}_1.$ In particular, since singletons $\{x\}$ are Borel sets, for all extended real $x$ the collection $\{\omega\in\Omega\mid E[Y\mid \mathcal{F}_1](\omega)=x\}$ must be an element of $\mathcal{F}_1.$ – whuber Dec 26 '18 at 23:45
  • 1
    @whuber right of course! Then I don't quite understand why we have the following: $E[Y|\mathcal{F}_1](HH) =E[Y|\mathcal{F}_1](HT) = z,$. I can see that the events cannot be further decomposed, but why must the 2 be equal? – itzjustricky Dec 27 '18 at 02:27
  • @itztricky Because in every element of $\mathcal{F}_1,$ either both of HH, HT are in it or both are not in it. *Ergo*, all measurable functions must have the same values on both outcomes. You might be overthinking this. Perhaps it would help to know that this is not a statistical argument--it is just a basic set theory deduction. – whuber Dec 27 '18 at 13:45
  • @whuber I actually understand it now! thank you for your comments, very helpful. – itzjustricky Dec 27 '18 at 21:29

Take the sample space $\Omega$ as the Cartesian product $\{H,T\}\times\{H,T\}$, with sigma-field $\mathscr{F}$ equal to the class of all subsets of $\Omega$. The sigma-field generated by $Y$ (denoted by $\sigma(Y)$) is the smallest sub-sigma-field of $\mathscr{F}$ in which $Y$ is measurable. Since $Y\in\{0,1,2\}$, the inverse images $$ Y^{-1}(\{0\}) = \{(T,T)\}\, , \quad Y^{-1}(\{1\}) = \{(H,T),(T,H)\}\, , \quad \quad Y^{-1}(\{2\}) = \{(H,H)\} \, , $$ show that $$ \sigma(Y)=\sigma\left\{\{(T,T)\}, \{(H,T),(T,H)\}, \{(H,H)\}\right\} = \left\{\emptyset,\{(T,T)\}, \{(H,T),(T,H)\}, \{(H,H)\},\{(H,T),(T,H),(H,H)\},\{(T,T),(H,H)\},\{(T,T),(H,T),(T,H)\},\Omega\right\} \, . $$ Define $$ \mathscr{G}=\{\emptyset,\{(H,H),(H,T)\},\{(T,T),(T,H)\},\Omega\}\subset\mathscr{F} \, . $$

The conditional expectation $Z=\mathrm{E}[Y\mid \mathscr{G}]$ is a $\mathscr{G}$-measurable random variable satisfying $$ \mathrm{E}[Y I_A]=\mathrm{E}[Z I_A] \, , \qquad\qquad (*) $$ for every $A\in\mathscr{G}$. The fact that $Z$ is $\mathscr{G}$-measurable entails that it is constant in the atoms of $\mathscr{G}$ (this is the crucial idea). Let $$ Z(H,H) = Z(H,T) = a, \quad Z(T,T) = Z(T,H) = b \, . $$ Taking $A=\{(H,H),(H,T)\}$, relation $(*)$ yields $$ 2 \cdot p^2 + 1 \cdot p(1-p) = a \cdot p^2 + a \cdot p(1-p) \, , $$ implying that $a=1+p$. Similarly, we find $b=p$.
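
Because $\mathscr{G}$ is generated by a finite partition, this boils down to averaging over atoms: $Z(\omega)=\mathrm{E}[Y I_A]/\mathrm{P}(A)$ whenever $\omega$ lies in the atom $A$ (and $\mathrm{P}(A)>0$). A small symbolic sketch (SymPy, with $p$ left as a symbol) reproduces $a=1+p$ and $b=p$:

```python
import sympy as sp

p = sp.symbols("p", positive=True)

P = {("H", "H"): p**2, ("H", "T"): p*(1 - p),
     ("T", "H"): p*(1 - p), ("T", "T"): (1 - p)**2}
Y = {w: sum(1 for toss in w if toss == "H") for w in P}

# Atoms of G: outcomes grouped by the result of the first toss
atoms = [{("H", "H"), ("H", "T")}, {("T", "H"), ("T", "T")}]

for A in atoms:
    # Constant value of Z on the atom A: E[Y I_A] / P(A)
    Z_on_A = sum(Y[w]*P[w] for w in A) / sum(P[w] for w in A)
    print(sorted(A), sp.simplify(Z_on_A))
# [('H','H'), ('H','T')]  p + 1
# [('T','H'), ('T','T')]  p
```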

Finally, $$ \mathrm{E}[Y]= 1 \cdot 2 p(1-p) + 2\cdot p^2 = 2p \, , $$ and $$ \mathrm{E}[Z] = (1+p) \cdot (p^2 + p(1-p)) + p \cdot ((1-p)^2 + p(1-p)) = 2p \, , $$ as "expected".

P.S. May I suggest a beautiful related exercise? Let $\Omega=[0,1]$, $\mathscr{F}$ be the Borel subsets of $\Omega$, and $P$ be Lebesgue measure. Let $\mathscr{G}$ be the sub-sigma-field of $\mathscr{F}$ generated by the partition $$\{[0,1/2],(1/2,1]\}\, . $$ Let $X$ be the identity map ($X(\omega)=\omega$). Plot the graph of $\mathrm{E}[X\mid\mathscr{G}]$.
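
Should you want to check your plot afterwards, a short numerical sketch (NumPy/Matplotlib; the grid size is arbitrary) approximates $\mathrm{E}[X\mid\mathscr{G}]$ by averaging $X$ over each atom of the partition:

```python
import numpy as np
import matplotlib.pyplot as plt

omega = np.linspace(0, 1, 1001)
first_atom = omega <= 0.5   # the atom [0, 1/2]; its complement is (1/2, 1]

# On each atom A, E[X | G] is the average of X over A: (1/P(A)) * integral of x over A
cond_exp = np.where(first_atom,
                    omega[first_atom].mean(),
                    omega[~first_atom].mean())

plt.plot(omega, omega, "--", label=r"$X(\omega)=\omega$")
plt.step(omega, cond_exp, where="post", label=r"$E[X\mid\mathscr{G}]$")
plt.xlabel(r"$\omega$")
plt.legend()
plt.show()
```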

Zen
  • Isn't it the case that $E[X|\mathscr{G}]$ is unique in your final exercise, because $\mathscr{G}$ is finite? There can be no question of changing $E[X|\mathscr{G}]$ on a nonempty set of measure zero w.r.t. $\mathscr{G}$. – whuber Nov 01 '13 at 22:45