12

Is it possible to calculate the expectation of a function of a random variable using only the r.v.'s CDF? Say I have a function $g(x)$ with the property $\int_{-\infty}^{\infty}g(x)\,dx < \infty$, and the only information I have about the random variable is its CDF.

For example, I have a scenario where there are three timers that can be modeled as exponential random variables $X_1,X_2,X_3$ with rate parameters $\lambda_1,\lambda_2,\lambda_3$ respectively. At each moment in time I earn a reward according to some reward function $g(x)$; that is, my reward for waiting until time $t$ can be written as $\int_0^t g(x)\,dx$. However, $g(x)$ exhibits diminishing returns, so the marginal reward from waiting one second at $t=0$ is greater than from waiting one second at, say, $t=27$. This 'game' ends when one of two things happens: either both timers $X_1$ and $X_2$ have rung, or both timers $X_1$ and $X_3$ have rung. I'm trying to find the expected reward of playing this game.

Currently I can calculate the CDF of the random variable modeling the time until the game ends, but I don't know how to use this information when what I really need is the expected reward associated with this time.

So far I have the additional random variables: $$ W_{12}=\max(X_1,X_2) \quad W_{13}=\max(X_1,X_3) \quad Z=\min(W_{12},W_{13})$$ Also let $F_i(x), i\in \{1,2,3\}$ denote the CDF of $X_i$. The CDF of $Z$ can then be written as: $$F_Z(t) = F_1(t)F_2(t) + F_1(t)F_3(t) - F_1(t)F_2(t)F_3(t)$$
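For a quick sanity check of this formula (with illustrative rates $\lambda = (1, 2, 3)$; any values would do), it matches the empirical CDF from a short simulation:

```python
import numpy as np

rng = np.random.default_rng(0)
lam = [1.0, 2.0, 3.0]                      # illustrative rates; any values would do
n = 200_000

# Simulate the three exponential timers and the game-ending time Z
x1, x2, x3 = (rng.exponential(1 / l, n) for l in lam)
z = np.minimum(np.maximum(x1, x2), np.maximum(x1, x3))

def F_Z(t):
    """Closed-form CDF of Z, with F_i(t) = 1 - exp(-lambda_i * t)."""
    f1, f2, f3 = (1 - np.exp(-l * t) for l in lam)
    return f1 * f2 + f1 * f3 - f1 * f2 * f3

for t in (0.5, 1.0, 2.0):
    print(t, F_Z(t), np.mean(z <= t))      # formula vs. empirical frequency
```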

I know that when a random variable takes on non-negative values, you can use a shortcut to calculate the expectation from the CDF, namely $E[X] = \int_0^\infty P(X\geq x)\,dx = \int_0^\infty (1-F(x))\,dx$. Is there something similar I could use for a function of a random variable, or is it necessary to compute the pdf of $Z$ first and then compute $\int_0^\infty g(t)f_Z(t)\,dt$?
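For instance, for $X \sim \text{Exponential}(\lambda)$ this shortcut gives $E[X] = \int_0^\infty e^{-\lambda x}\,dx = 1/\lambda$ without ever touching the pdf.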

CoconutBandit
  • 273
  • 2
  • 10
  • 2
    What do you mean by "only" information? The CDF tells you *everything* about the RV that might be related to expectations! It seems like your underlying issue might have to do with the *computational form* in which the CDF is given to you. Please explain your circumstances. BTW, $E[g(X)]$ can be undefined or infinite even when the integral of $|g|$ is finite. – whuber Jul 06 '16 at 18:25
  • 2
    I think you are looking for integration by parts https://en.wikipedia.org/wiki/Integration_by_parts – seanv507 Jul 06 '16 at 18:56

1 Answer

16

When $F$ is the CDF of a random variable $X$ and $g$ is a (measurable) function, the expectation of $g(X)$ can be found as a Riemann-Stieltjes integral

$$\mathbb{E}(g(X)) = \int_{-\infty}^\infty g(x) dF(x).$$

This expresses the Law of the Unconscious Statistician.
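In the familiar special cases this Riemann-Stieltjes integral reduces to the usual formulas: when $F$ has a density $f = F^\prime$,

$$\mathbb{E}(g(X)) = \int_{-\infty}^\infty g(x)\,f(x)\,\text{d}x,$$

and when $X$ is discrete with atoms $x_i$,

$$\mathbb{E}(g(X)) = \sum_i g(x_i)\,\Pr(X = x_i).$$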

If $g$ is also differentiable, write $dF = -d(1-F)$ and integrate by parts to give

$$\mathbb{E}(g(X)) = -g(x)(1-F(x)){\big|}_{-\infty}^\infty + \int_{-\infty}^\infty (1-F(x)) g^\prime(x)\, \text{d}x$$

provided both addends converge. This means several things, which may be simply expressed by breaking the integral at some definite finite value such as $0$:

  1. ${\lim}_{x\to -\infty} g(x)(1-F(x))$ and ${\lim}_{x\to \infty} g(x)(1-F(x))$ exist and are finite. If so, the first addend is the difference of these two.

  2. $\lim_{t\to -\infty} \int_t^0 (1-F(x))g^\prime(x)\,\text{d}x$ and $\lim_{t\to \infty} \int_0^t (1-F(x))g^\prime(x)\,\text{d}x$ exist and are finite. If so, the second addend is the sum of these two.

A good place to break the integral is at any zero of $g$, because--provided $g$ eventually decreases fast enough for large $|x|$--that causes the first addend to vanish, leaving only the integral of $g^\prime$ against the survival function $1-F$.
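As a quick numerical illustration of this identity (a minimal sketch only: the choice $X\sim\text{Exponential}(1)$, the diminishing-returns function $g(x)=1-e^{-x}$, and the use of `scipy.integrate.quad` are assumptions made for the example, not part of the question), the integral of $g$ against the density agrees with the integral of $g^\prime$ against the survival function; because $g(0)=0$ and $g$ is bounded, the boundary term vanishes when the integral is broken at $0$:

```python
import numpy as np
from scipy.integrate import quad

# Illustrative check of the parts formula with X ~ Exponential(1) and g(x) = 1 - exp(-x).
# Both the distribution and g are assumptions chosen so that g(0) = 0 and the
# boundary term vanishes; they are not taken from the original post.
g  = lambda x: 1 - np.exp(-x)          # g with g(0) = 0
dg = lambda x: np.exp(-x)              # g'(x)
f  = lambda x: np.exp(-x)              # density of X ~ Exponential(1)
S  = lambda x: np.exp(-x)              # survival function 1 - F(x)

direct, _   = quad(lambda x: g(x) * f(x), 0, np.inf)   # E[g(X)] via the density
by_parts, _ = quad(lambda x: S(x) * dg(x), 0, np.inf)  # integral of (1 - F) g'
print(direct, by_parts)                # both are about 0.5
```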

Example

The expectation of a non-negative variable $X$ is obtained by applying the formula to the identity function $g(x)=x$ for which $g^\prime(x)=1$ and utilizing the fact that the integration may begin at zero:

$$\mathbb{E}(X) = -x(1-F(x))\big|_{0}^\infty + \int_{0}^\infty (1-F(x))\,\text{d}x.$$

Provided $\lim_{x\to\infty} x (1-F(x)) = 0$ (that is, the survival function does not have an overly heavy tail), the upper limit of the first term vanishes. Its lower limit obviously vanishes. We are left only with the integral, giving the expression in the question.
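Applied to the game in the question, the same device handles the accumulated reward: take $G(t)=\int_0^t g(x)\,\text{d}x$, so that $G(0)=0$ and $G^\prime=g$, and (provided the boundary term vanishes)

$$\mathbb{E}\left[\,\int_0^Z g(x)\,\text{d}x\right] = \int_0^\infty g(t)\,(1-F_Z(t))\,\text{d}t,$$

with no need to differentiate $F_Z$. The sketch below checks this numerically; the reward density $g(t)=e^{-t}$, the rates $\lambda=(1,2,3)$, and the NumPy/SciPy calls are illustrative assumptions, not taken from the original posts.

```python
import numpy as np
from scipy.integrate import quad

rng = np.random.default_rng(1)
lam = [1.0, 2.0, 3.0]                      # assumed rates, for illustration only
g = lambda t: np.exp(-t)                   # assumed diminishing-returns reward density

def surv_Z(t):
    """Survival function 1 - F_Z(t), with F_Z as derived in the question."""
    f1, f2, f3 = (1 - np.exp(-l * t) for l in lam)
    return 1 - (f1 * f2 + f1 * f3 - f1 * f2 * f3)

# Expected total reward E[G(Z)] as the integral of g(t) * (1 - F_Z(t))
expected_reward, _ = quad(lambda t: g(t) * surv_Z(t), 0, np.inf)

# Monte Carlo check: for this g, the total reward is G(Z) = 1 - exp(-Z)
x1, x2, x3 = (rng.exponential(1 / l, 200_000) for l in lam)
z = np.minimum(np.maximum(x1, x2), np.maximum(x1, x3))
print(expected_reward, np.mean(1 - np.exp(-z)))
```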

whuber
  • 281,159
  • 54
  • 637
  • 1,101
  • Thanks, this looks like exactly what I wanted. I just need to read up on my Riemann-Stieltjes integration now. – CoconutBandit Jul 06 '16 at 21:22
  • In your application, because $F$ is continuously differentiable everywhere except at $0$, you can break the integral at $0$ into two Riemann integrals and ignore the complications altogether. – whuber Jul 06 '16 at 21:43
  • What do you mean by 'complications'? Also, in your second point, should $\int_t^0(1-F(x))g(x)dx$ be $\int_t^0(1-F(x))g'(x)dx$? If not, why did the $g'(x)$ change to $g(x)$? – CoconutBandit Jul 07 '16 at 13:58
  • (1) Thank you, those primes needed to be there. (2) "Complications" refers to needing the Riemann-Stieltjes integral instead of the Riemann integral. – whuber Jul 07 '16 at 14:01
  • Is this a bit of a circular argument? Isn't the _proof_ of the Law of the Unconscious Statistician based on the result that for a nonnegative random variable $X$, $$E[X] = \int_0^\infty [1-F_X(x)]\,\mathrm dx \quad ??$$ (e.g. the Wikipedia link that you cite for LOTUS refers to a page on math.uah.edu for the proof of LOTUS that begins with the displayed result above.) – Dilip Sarwate Jul 07 '16 at 15:18
  • @Dilip Not at all. LOTUS asserts that the Lebesgue integral of $x \text{d}F_X(x)$ over $\mathbb{R}$ is the same as the Lebesgue integral of $X$ over some (original, implicit) probability space $(\Omega, \mathfrak{S}, \mathbb{P})$. – whuber Jul 07 '16 at 16:43
  • I'm still having trouble with the step $dF = -d(1-F)$. Why is this true? Is this a property of RS-integrals that I'm unfamiliar with? – CoconutBandit Jul 11 '16 at 13:24
  • 1
    This uses three basic rules of differentiation: the sum rule, the product rule, and the fact that constants have zero derivatives. – whuber Jul 11 '16 at 15:14
  • Why/how does the lower limit obviously vanish once $x \le 0$? To me the natural extension of $F(x)$ below the lower bound of the domain of $x$ is that $F(x)=0$, thus $(1-F)=1$ in that region of the real line, and as $x \rightarrow -\infty$ the lower-limit part of the boundary term diverges. – Dave Sep 27 '17 at 19:57
  • Previous comment is about the example calculation of $E[x]$ btw. – Dave Sep 27 '17 at 20:12
  • @Dave Because the calculation is for a non-negative variable. However, my exposition of that is not as clear as it could be, so let's see whether I can make it better. – whuber Sep 27 '17 at 20:18
  • Your fix brings up the point that I made on another question: you can only get the boundary term to drop out if the lower bound of the domain of integration is 0. Note that for a distribution such that $dF(x)>0$ only for $0 – Dave Sep 27 '17 at 20:34
  • 2
    @Dave Why does that matter? For a variable with a lower bound, you get a boundary term, so include it. For a variable with no lower bound, sum the contributions from its positive part and negative part separately. Indeed, the same method works for any variable. Thus, $E[X]=\int_0^\infty (1-F(x))dx - \int_{-\infty}^0 F(x) dx$ whenever the expectation exists. This seems not even worth discussing, because all we're debating is how one goes about evaluating an integral *via* integration by parts. That's scarcely controversial. – whuber Sep 27 '17 at 20:36
  • This is nice. But can you give a reference for it? – ablmf Jul 12 '19 at 11:29
  • @ablmf Consult any account of integration by parts. – whuber Jul 12 '19 at 11:49
  • @whuber why is the step $dF=-d(1-F)$ necessary? Would you get the same result in terms of $F$ as opposed to $1-F$ if you don't carry out that substitution? – dleal Feb 03 '20 at 18:21
  • @dleal You can, but the point made at the end of this post is that many applications (including the one in the question) focus on positive values. There, the integral of $1-F$ converges (provided $X$ has an expectation) but the integral of $F$ diverges for all $X.$ – whuber Feb 03 '20 at 19:06
  • @whuber thank you, that clears it up – dleal Feb 03 '20 at 21:40