38

Let's say we have a random variable $X$ with known mean and variance. The question is: what is the variance of $f(X)$ for some given function $f$? The only general method that I'm aware of is the delta method, but it gives only an approximation. Right now I'm interested in $f(x)=\sqrt{x}$, but it would also be nice to know some general methods.

Edit 29.12.2010
I've done some calculations using Taylor series, but I'm not sure whether they are correct, so I'd be glad if someone could confirm them.

First we need to approximate $E[f(X)]$
$E[f(X)] \approx E[f(\mu)+f'(\mu)(X-\mu)+\frac{1}{2}\cdot f''(\mu)(X-\mu)^2]=f(\mu)+\frac{1}{2}\cdot f''(\mu)\cdot Var[X]$

Now we can approximate $D^2 [f(X)]$
$E[(f(X)-E[f(X)])^2] \approx E[(f(\mu)+f'(\mu)(X-\mu)+\frac{1}{2}\cdot f''(\mu)(X-\mu)^2 -E[f(X)])^2]$

Using the approximation of $E[f(X)]$ we know that $f(\mu)-E[f(X)] \approx -\frac{1}{2}\cdot f''(\mu)\cdot Var[X]$

Using this we get:
$D^2[f(X)] \approx \frac{1}{4}\cdot f''(\mu)^2\cdot Var[X]^2-\frac{1}{2}\cdot f''(\mu)^2\cdot Var[X]^2 + f'(\mu)^2\cdot Var[X]+\frac{1}{4}f''(\mu)^2\cdot E[(X-\mu)^4] +f'(\mu)f''(\mu)E[(X-\mu)^3]$
$D^2 [f(X)] \approx \frac{1}{4}\cdot f''(\mu)^2 \cdot [D^4 X-(D^2 X)^2]+f'(\mu)^2\cdot D^2 X +f'(\mu)f''(\mu)D^3 X$
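
A quick numerical check of this kind of expansion can be sketched in Python. The test case is an assumption made here for illustration: $X\sim\chi^2_k$ with $k=10$ and $f(x)=\sqrt{x}$, for which the central moments of $X$ and the exact variance of $\sqrt{X}$ (a chi-distributed variable) are known in closed form. The helper `variance_approximations` is a hypothetical name; it evaluates the first-order term $f'(\mu)^2 Var[X]$ and the second-order expression $f'(\mu)^2 Var[X] + f'(\mu)f''(\mu)E[(X-\mu)^3] + \frac{1}{4}f''(\mu)^2\,(E[(X-\mu)^4]-Var[X]^2)$ obtained by expanding the square.

```python
import math


def variance_approximations(d1, d2, var, mu3, mu4):
    """Return (first_order, second_order) Taylor approximations of Var[f(X)].

    d1, d2 are f'(mu) and f''(mu); var, mu3, mu4 are the 2nd, 3rd and 4th
    central moments of X.
    """
    first = d1 ** 2 * var
    second = first + d1 * d2 * mu3 + 0.25 * d2 ** 2 * (mu4 - var ** 2)
    return first, second


# Assumed test case: X ~ chi-square with k degrees of freedom, f(x) = sqrt(x).
k = 10
mu, var, mu3, mu4 = k, 2 * k, 8 * k, 12 * k * (k + 4)  # chi-square moments
d1 = 0.5 / math.sqrt(mu)      # f'(mu)  for f = sqrt
d2 = -0.25 * mu ** -1.5       # f''(mu) for f = sqrt

first, second = variance_approximations(d1, d2, var, mu3, mu4)

# sqrt(X) is chi-distributed: E[sqrt(X)] = sqrt(2) * Gamma((k+1)/2) / Gamma(k/2)
mean_sqrt = math.sqrt(2) * math.exp(math.lgamma((k + 1) / 2) - math.lgamma(k / 2))
exact = k - mean_sqrt ** 2    # Var[sqrt(X)] = E[X] - (E[sqrt(X)])^2

print(first, second, exact)
```

Comparing the three printed numbers for a few values of `k` shows how much each truncation loses for this particular distribution; other distributions can behave differently.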

Tomek Tarczynski
  • The delta method is used for asymptotic distributions. You cannot use it when you have only one random variable. – mpiktas Dec 28 '10 at 14:35
  • @mpiktas: Actually I don't know much about the delta method, I've just read something on Wikipedia. This is a quotation from the wiki: "The delta method uses second-order Taylor expansions to approximate the variance of a function of one or more random variables". – Tomek Tarczynski Dec 29 '10 at 13:08
  • it seems wikipedia has exactly what you want: http://en.wikipedia.org/wiki/Taylor_expansions_for_the_moments_of_functions_of_random_variables. I will reedit my answer, it seems that I underestimated Taylor expansion. – mpiktas Dec 29 '10 at 13:39
  • Tomek, if you disagree with the edits that were made (not by me), you can always change them again, or roll them back, or just point out the differences and ask for clarification. – Glen_b May 11 '13 at 22:16
  • 2
    @Glen_b: I agree with them. E(X-mu) = 0 doesn't imply that E[(X-mu)^3] = 0. – Tomek Tarczynski May 12 '13 at 07:32
  • @mpiktas: what do you mean by saying that delta method is used for asymptotic distributions and not only one random variable? I use delta method all the time for example when I'm to find variance of exp(x), where x is the random variable. Thanks! – HeyJane Oct 04 '17 at 08:32
  • @HeyJane My first encounter with the delta method was in the book "Asymptotic Statistics" by A. van der Vaart. I wrongly assumed that you can only use it in an asymptotic setting. My answer clearly shows that this is not the case. Although I still do not get why it is called the delta method, when it is a basic application of the Taylor expansion formula. – mpiktas Oct 05 '17 at 14:19

2 Answers

38

Update

I've underestimated Taylor expansions. They actually work. I assumed that integral of the remainder term can be unbounded, but with a little work it can be shown that this is not the case.

A Taylor expansion with remainder holds on a bounded closed interval. For a random variable with finite variance, the Chebyshev inequality gives

$$P(|X-EX|>c)\le \frac{\operatorname{Var}(X)}{c^2}$$

So for any $\varepsilon>0$ we can find large enough $c$ so that

$$P(X\in [EX-c,EX+c])=P(|X-EX|\le c)\ge 1-\varepsilon$$
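
To make the choice of $c$ concrete: setting the Chebyshev bound $\operatorname{Var}(X)/c^2$ equal to $\varepsilon$ gives $c=\sqrt{\operatorname{Var}(X)/\varepsilon}$. The small Python sketch below computes that radius; the function name `chebyshev_radius` and the example numbers are illustrative assumptions, not part of the argument.

```python
import math


def chebyshev_radius(variance, eps):
    """Smallest c for which Chebyshev's bound Var(X)/c^2 is at most eps."""
    if variance < 0 or not 0 < eps <= 1:
        raise ValueError("need variance >= 0 and 0 < eps <= 1")
    return math.sqrt(variance / eps)


# Example: Var(X) = 4 and eps = 0.01, so at least 99% of the mass
# lies in [EX - c, EX + c].
c = chebyshev_radius(4.0, 0.01)
print(c)
```

The bound is distribution-free and therefore usually conservative; for a specific distribution a much smaller $c$ may already capture the same probability.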

First let us estimate $Ef(X)$. We have \begin{align} Ef(X)=\int_{|x-EX|\le c}f(x)dF(x)+\int_{|x-EX|>c}f(x)dF(x) \end{align} where $F(x)$ is the distribution function for $X$.

Since the domain of the first integral is the interval $[EX-c,EX+c]$, which is bounded and closed, we can apply the Taylor expansion: \begin{align} f(x)=f(EX)+f'(EX)(x-EX)+\frac{f''(EX)}{2}(x-EX)^2+\frac{f'''(\alpha)}{3!}(x-EX)^3 \end{align} where $\alpha\in [EX-c,EX+c]$ (in general depending on $x$), and the equality holds for all $x\in[EX-c,EX+c]$. I took only $4$ terms in the Taylor expansion, but in general we can take as many as we like, as long as the function $f$ is smooth enough.

Substituting this formula into the previous one we get

\begin{align} Ef(X)&=\int_{|x-EX|\le c}\left(f(EX)+f'(EX)(x-EX)+\frac{f''(EX)}{2}(x-EX)^2\right)dF(x)\\ &+\int_{|x-EX|\le c}\frac{f'''(\alpha)}{3!}(x-EX)^3dF(x) +\int_{|x-EX|>c}f(x)dF(x) \end{align} Now we can extend the domain of integration of the first two integrals to the whole real line to get the following formula

\begin{align} Ef(X)&=f(EX)+\frac{f''(EX)}{2}E(X-EX)^2+R_3 \end{align} where \begin{align} R_3&=\frac{f'''(\alpha)}{3!}E(X-EX)^3\\ &+\int_{|x-EX|>c}\left(f(x)-f(EX)-f'(EX)(x-EX)-\frac{f''(EX)}{2}(x-EX)^2-\frac{f'''(\alpha)}{3!}(x-EX)^3\right)dF(x) \end{align} (the linear term drops out because $E(X-EX)=0$). Under some moment conditions we can show that the second term of this remainder is of the same order as $P(|X-EX|>c)$, which is small. Unfortunately the first term remains, so the quality of the approximation depends on $E(X-EX)^3$ and on the behaviour of the third derivative of $f$ on bounded intervals. Such an approximation should work best for random variables with $E(X-EX)^3=0$.
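
As a worked instance for the function from the question, take $f(x)=\sqrt{x}$ with $EX=\mu>0$ and $\operatorname{Var}(X)=\sigma^2$ (the symbols $\mu$ and $\sigma^2$ are shorthand introduced here). This is only a sketch of the leading terms, not a bound on the remainder:

\begin{align} f'(\mu)=\frac{1}{2\sqrt{\mu}},\qquad f''(\mu)=-\frac{1}{4\mu^{3/2}},\qquad f'''(\alpha)=\frac{3}{8\alpha^{5/2}} \end{align}

\begin{align} E\sqrt{X}\approx\sqrt{\mu}-\frac{\sigma^2}{8\mu^{3/2}},\qquad \frac{f'''(\alpha)}{3!}E(X-EX)^3=\frac{E(X-EX)^3}{16\,\alpha^{5/2}} \end{align}

The factor $\alpha^{-5/2}$ grows without bound as $\alpha\to 0$, so for $\sqrt{x}$ the first remainder term can become large whenever the interval $[EX-c,EX+c]$ reaches close to zero.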

Now for the variance we can use the Taylor approximation for $f(x)$, subtract the formula for $Ef(X)$ and square the difference. Then

$E(f(X)-Ef(X))^2=(f'(EX))^2\operatorname{Var}(X)+T_3$

where $T_3$ involves the moments $E(X-EX)^k$ for $k=3,4,5,6$. We can also arrive at this formula by using only the first-order Taylor expansion, i.e. using only the first and second derivatives. The error term would be similar.

Another way is to expand $f^2(x)$: \begin{align} f^2(x)&=f^2(EX)+2f(EX)f'(EX)(x-EX)\\ &+[(f'(EX))^2+f(EX)f''(EX)](x-EX)^2+\frac{(f^2(\beta))'''}{3!}(x-EX)^3 \end{align}

Similarly we then get \begin{align*} Ef^2(X)=f^2(EX)+[(f'(EX))^2+f(EX)f''(EX)]\operatorname{Var}(X)+\tilde{R}_3 \end{align*} where $\tilde{R}_3$ is similar to $R_3$.

The formula for the variance then becomes \begin{align} \operatorname{Var}(f(X))=[f'(EX)]^2\operatorname{Var}(X)-\frac{[f''(EX)]^2}{4}\operatorname{Var}^2(X)+\tilde{T}_3 \end{align} where $\tilde{T}_3$ involves only third moments and above.
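
How much the dropped term $\tilde{T}_3$ matters can be explored numerically. The sketch below is an illustration under assumptions chosen here: $f(x)=\sqrt{x}$ and $X$ uniform on $[a,b]$ with $0\le a<b$, where $E\sqrt{X}=\frac{2}{3}\frac{b^{3/2}-a^{3/2}}{b-a}$ and hence $\operatorname{Var}(\sqrt{X})=EX-(E\sqrt{X})^2$ are available exactly. All function names are hypothetical.

```python
import math


def exact_var_sqrt_uniform(a, b):
    """Exact Var[sqrt(X)] for X ~ Uniform(a, b), 0 <= a < b."""
    mean_sqrt = (2.0 / 3.0) * (b ** 1.5 - a ** 1.5) / (b - a)
    return (a + b) / 2.0 - mean_sqrt ** 2


def approx_var_sqrt(mu, var):
    """[f'(mu)]^2 Var - [f''(mu)]^2 / 4 * Var^2 for f = sqrt (T_3 dropped)."""
    d1 = 0.5 / math.sqrt(mu)
    d2 = -0.25 * mu ** -1.5
    return d1 ** 2 * var - 0.25 * d2 ** 2 * var ** 2


def relative_error(a, b):
    """Relative error of the truncated formula for X ~ Uniform(a, b)."""
    mu, var = (a + b) / 2.0, (b - a) ** 2 / 12.0
    exact = exact_var_sqrt_uniform(a, b)
    return abs(approx_var_sqrt(mu, var) - exact) / exact


print(relative_error(10.0, 11.0))  # support far from zero
print(relative_error(0.0, 1.0))    # support touching zero
```

The two calls contrast a support far from zero with one that touches zero, where the derivatives of $\sqrt{x}$ blow up.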

joriki
mpiktas
  • I don't need to know the exact value of the variance; an approximation should work for me. – Tomek Tarczynski Dec 29 '10 at 13:10
  • Indeed, the approximate formula for $\mathbb{E}[f(X)]$ in the OP is often used in risk analysis in economics, finance and insurance. – Raskolnikov Dec 29 '10 at 14:50
  • @Raskolnikov, yes, but it contradicts my admittedly stale knowledge of Taylor expansions. Clearly the remainder term must be taken into account. If the random variable is bounded, then no problem, since polynomials approximate continuous functions on a bounded interval uniformly. But we deal with unbounded random variables. Of course for a normal random variable we can say that it is effectively bounded, but still in the general case some nasty surprises can arise, or not. I will fix my answer when I have a clear answer. – mpiktas Dec 29 '10 at 15:11
  • @mpiktas: I've done some simulations and accuracy of the approximation strongly depends on the distribution, for unimodal distribution it seems to work fine, but the error could be significant for uniform distribution. I was testing $f(x)=\sqrt{x}$ and in case of $\chi^{2}$ the relative error was about 3%, but in case of uniform distribution it was more than 10%. I'm talking about the variance, because mean is approximated far better. – Tomek Tarczynski Dec 29 '10 at 15:46
  • 3
    @Tomek Tarczynski, the third derivative of $\sqrt{x}$ goes to zero quite quickly for large $x$, but is unbounded near zero. So if you picked uniform distribution with support close to zero, the remainder term can get large. – mpiktas Dec 30 '10 at 09:04
  • @mpiktas: Great answer, You put a lot of effort in it. I'm very grateful. – Tomek Tarczynski Dec 30 '10 at 09:09
  • @Tomek Tarczynski, you're welcome. – mpiktas Dec 30 '10 at 09:31
  • For $f(x)=\sqrt x$, why delta method will not work? As far I know, the formula of delta method for one random variable is: $V[f(X)]=[\frac{d}{dx}f(x)]^2 V(X)$. If this is correct, then isn't for $f(x)=\sqrt x$, $V[f(X)]=[\frac{d}{dx}\sqrt x]^2 V(X)=[\frac{1}{2\sqrt x}]^2 V(X)=[\frac{1}{4 x}] V(X)$ ? – user 31466 Jun 27 '17 at 14:00
  • The reference of the formula of delta method for one random variable, $V[f(X)]\approx[\frac{d}{dx}f(x)]^2 V(X)$, is: http://onlinelibrary.wiley.com/doi/10.1002/9781118307656.app1/pdf equation (A.4). – user 31466 Jun 27 '17 at 14:13
  • 1
    Note that in your link the equality is approximate. In this answer all the equations are exact. Furthermore, for the variance note that the first derivative is evaluated at $EX$, not $x$. Also I never stated that this will not work for $\sqrt{x}$, only that for $\sqrt{x}$ the approximate formula might have a huge error if the domain of $X$ is close to zero. – mpiktas Jun 28 '17 at 08:55
  • 1
    In your answer, you wrote \begin{align} f(x)=f(EX)+f'(EX)(x-EX)+\frac{f''(EX)}{2}(x-EX)^2+\frac{f'''(\alpha)}{3}(x-EX)^3 \end{align}. But shouldn't the denominator of the last term be $3!$? – user 31466 Jun 28 '17 at 13:50
  • @Leaf, yes sure. – mpiktas Jul 17 '17 at 10:57
9

Knowing the first two moments of $X$ (mean and variance) is not enough if the function $f(x)$ is arbitrary (nonlinear). This holds not only for computing the variance of the transformed variable $Y=f(X)$, but also for its mean. To see this, and perhaps to attack your problem, you can assume that your transformation function has a Taylor expansion around the mean of $X$ and work from there.
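
A small Python sketch can illustrate the point. The two discrete distributions below are made up for this purpose: both have mean 4 and variance 1, yet the mean and variance of $\sqrt{X}$ come out different. The helper `moments` is a hypothetical name.

```python
import math


def moments(values, probs, g=lambda x: x):
    """Return (mean, variance) of g(X) for a discrete distribution."""
    mean = sum(p * g(v) for v, p in zip(values, probs))
    var = sum(p * (g(v) - mean) ** 2 for v, p in zip(values, probs))
    return mean, var


# Distribution A: two points, 3 and 5, each with probability 1/2.
a_vals, a_probs = [3.0, 5.0], [0.5, 0.5]
# Distribution B: three points, 4 - sqrt(2), 4, 4 + sqrt(2),
# with probabilities 1/4, 1/2, 1/4.
s = math.sqrt(2.0)
b_vals, b_probs = [4.0 - s, 4.0, 4.0 + s], [0.25, 0.5, 0.25]

print(moments(a_vals, a_probs), moments(b_vals, b_probs))  # same mean, variance
mean_a, var_a = moments(a_vals, a_probs, math.sqrt)
mean_b, var_b = moments(b_vals, b_probs, math.sqrt)
print(mean_a, var_a)
print(mean_b, var_b)
```

The difference between the two results comes entirely from moments of order three and higher, which is exactly the information that the mean and variance of $X$ do not carry.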

leonbloy