Suppose $x$ and $y$ are independently distributed according to CDFs $F_x$ and $F_y$ respectively over compact support $[0,1]$. Under what condition is $\operatorname{Cov}(x, \min\{x,y\})\gt 0$?
-
What does $F$ denote? – wolfies May 24 '17 at 15:32
-
Thank you for the response - F denotes the respective CDFs. I just hoped to say both variables have compact support. I've updated the problem statement. – user341296 May 24 '17 at 15:55
-
Under what condition could the covariance be negative? – wolfies May 24 '17 at 17:07
-
That's my question, I guess. I am conjecturing that the covariance is always positive, but I couldn't prove it. – user341296 May 24 '17 at 17:35
-
Is it correct that you are assuming that the support of $X$ and $Y$ intersect, and are continuous, as distinct from say, $X \in (\frac12,1)$ and $Y \in (0,\frac12)$ – wolfies May 24 '17 at 17:54
-
It's definitely not the case that the covariance is always positive. Take, for instance, $X$ with a uniform distribution on $[1/2,1]$ and $Y=1-X$, entailing $\min(X,Y)=Y$, whence $\operatorname{Cov}(X,\min(X,Y))=-\operatorname{Var}(X) \lt 0$. Are you perhaps assuming $X$ and $Y$ are independent? – whuber May 24 '17 at 18:09
-
Nice example! Even if the intention is that $X$ and $Y$ both have strictly positive densities over (0,1), extending your example to $X \sim \text{Uniform}(0,1)$ and taking $Y=1-\sqrt{X}$, the covariance will still be negative. – wolfies May 24 '17 at 18:41
-
Right, I am assuming they are independent. Sorry for not being clear. – user341296 May 24 '17 at 20:20
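The dependent-variable counterexamples in the comments above are easy to check numerically. Here is a minimal Monte Carlo sketch using NumPy; the seed and sample size are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10**6

# whuber's example: X ~ Uniform(1/2, 1) and Y = 1 - X, so min(X, Y) = Y.
x = rng.uniform(0.5, 1.0, n)
m = np.minimum(x, 1.0 - x)
print(np.cov(x, m)[0, 1])   # close to -Var(X) = -1/48, about -0.021

# wolfies' extension: X ~ Uniform(0, 1) and Y = 1 - sqrt(X),
# so both variables have strictly positive densities on (0, 1).
x = rng.uniform(0.0, 1.0, n)
m = np.minimum(x, 1.0 - np.sqrt(x))
print(np.cov(x, m)[0, 1])   # about -0.009, still negative
```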
1 Answer
Here is a sketch of an analysis based on routine, well-known properties of covariance.
Without loss of generality, we may assume $E(X)=0$ by subtracting $E(X)$ from both $X$ and $Y$: this changes none of the second-order quantities that will appear and yields the simplified formulas $\operatorname{Var}(X)=E(X^2)$ and $\operatorname{Cov}(X,\min(X,Y))=E(X\min(X,Y))$. Now condition on $Y$. For any fixed number $y$, the map $t\mapsto\min(t,y)$ is a nondecreasing function of $t$, and the covariance between a random variable and a nondecreasing function of itself is never negative (write $\operatorname{Cov}(A,g(A))=\tfrac12 E\big[(A-A')(g(A)-g(A'))\big]$ for an independent copy $A'$ of $A$; the integrand cannot be negative when $g$ is nondecreasing). Therefore
$$\operatorname{Cov}\big(X, \min(X,y)\big) = E\big(X\min(X,y)\big) - E(X)E\big(\min(X,y)\big) = E\big(X\min(X,y)\big) \ge 0.$$
This calculation assumed nothing beyond the existence of the variance. When $X$ and $Y$ are independent, taking the expectation over $Y$ gives
$$\operatorname{Cov}\big(X, \min(X,Y)\big) = E\big(X\min(X,Y)\big) = E_Y\Big[E\big(X\min(X,y)\big)\Big|_{y=Y}\Big] = E_Y\Big[\operatorname{Cov}\big(X, \min(X,y)\big)\Big|_{y=Y}\Big] \ge 0.$$
We may conclude
When $X$ and $Y$ are independent, $\operatorname{Cov}(X, \min(X,Y))$ is an average of nonnegative quantities and therefore is nonnegative; it can equal zero only when $\operatorname{Var}(X)=0$ or $\Pr(Y \gt X)=0$, for these are the only circumstances in which $\operatorname{Cov}(X,\min(X,y))$ vanishes for almost every value $y$ taken by $Y$.
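For readers who like to see this empirically, here is a small simulation of the conclusion and of the conditioning step. The Beta distributions, seed, and sample sizes are arbitrary illustrative assumptions with support $[0,1]$, not part of the argument:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10**6

# Independent X and Y on [0, 1] (arbitrary Beta distributions for illustration).
x = rng.beta(2.0, 5.0, n)
y = rng.beta(5.0, 2.0, n)
m = np.minimum(x, y)
print(np.cov(x, m)[0, 1])                     # positive

# The same covariance recovered as an average of conditional covariances
# Cov(X, min(X, y)) over independent draws of y.
ys = rng.beta(5.0, 2.0, 500)
print(np.mean([np.cov(x, np.minimum(x, v))[0, 1] for v in ys]))

# Boundary case of the conclusion: Pr(Y > X) = 0 forces the covariance to 0.
x = rng.uniform(0.5, 1.0, n)
y = rng.uniform(0.0, 0.5, n)                  # independent and always below X
print(np.cov(x, np.minimum(x, y))[0, 1])      # approximately 0 (here min = Y)
```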
Geometrical Solution
Recall that the covariance of $X$ and $Y$, when it exists, is the sum (or integral) of positive ("red") rectangle areas minus the sum (or integral) of negative ("blue") rectangle areas, as explained at https://stats.stackexchange.com/a/18200. I will show that when you use your crayons to draw the red and blue regions in depicting the covariance of $X$ and $\min(X,Y)$, the amount of red you use can be no less than the amount of blue. This is done by comparing the picture to the one you would draw for the covariance of $X$ and $Y$ themselves.
To this end, fix $x_1 \lt x_2$. Let $y_1\lt y_2$ be any real numbers. In the calculation of the covariance of $X$ and $Y$, these data contribute two rectangles, equiprobable because $X$ and $Y$ are independent: a red one and a blue one, one with vertices at $(x_1,y_1)$ and $(x_2,y_2)$ and the other with vertices at $(x_1,y_2)$ and $(x_2,y_1)$, respectively. Their signed areas are negatives of each other, so their contributions cancel in the calculation. Since this is true for all such $x_1,x_2,y_1,$ and $y_2$, the covariance of $X$ and $Y$ must be zero.
Now let $Y$ be replaced by $\min(X,Y)$. This changes some of the rectangles. They both still have the same base $x_2-x_1$, but their heights change when the $y_i$ are replaced by $\min(x_j,y_i)$. The sum of the new signed areas is
$$(x_2-x_1)(\color{red}{[\min(x_2,y_2)-\min(x_1,y_1)]} + \color{blue}{[\min(x_2,y_1)-\min(x_1,y_2)]})$$
(obtained by multiplying each base by its signed height and collecting the common factor $x_2-x_1$ that appears).
Rearranging the colored terms gives
$$(x_2-x_1)(\color{red}{\min(x_2,y_2)}-\color{blue}{\min(x_1,y_2)} + \color{blue}{\min(x_2,y_1)}-\color{red}{\min(x_1,y_1)}).$$
Since $x_1 \lt x_2$, $\min(x_1,y) \le \min(x_2,y)$ for all $y$. Letting $y=y_1$ shows $(1)$ the first difference in the right hand factor is non-negative and letting $y=y_2$ shows $(2)$ the second difference is non-negative. Therefore their sum is non-negative, whence the entire product (when multiplied by the positive value $x_2-x_1$) is non-negative. Since this is true for all $x_1 \lt x_2$,
$\operatorname{Cov}(X, \min(X,Y))$ exceeds $\operatorname{Cov}(X,Y)=0$ by some non-negative amount.
Moreover,
$\operatorname{Cov}(X,\min(X,Y))$ will be strictly positive when there is a positive chance that one of the differences in $(1)$ or $(2)$ is strictly positive. This happens exactly when $\operatorname{Var}(X)\gt 0$ and there is a positive chance that $Y\gt X$.
For independent $X$ and $Y$, and regardless of the compactness of their supports, these results completely characterize when the covariance of $X$ and $\min(X,Y)$ can be $0$ or positive; it can never be negative.
For those who might object that the "crayoning" picture of covariance applies only to finite discrete variables (or datasets), I will simply remark that you may take these pictures as accurate metaphors for the corresponding four-dimensional Lebesgue-Stieltjes integrals. (The integrand is proportional to $(x_2-x_1)(y_2-y_1)$, taken over the tuples $(x_1,x_2,y_1,y_2)$ where $x_1\lt x_2$, relative to the measure $dF_X(x_1)dF_X(x_2)dF_Y(y_1)dF_Y(y_2)$.) The argument presented here applies to those integrals without any change: you are welcome to rewrite it in that form.
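Concretely, the rectangle picture rests on the same identity used above, $\operatorname{Cov}(A,B)=\tfrac12 E\big[(A-A')(B-B')\big]$ for an independent copy $(A',B')$ of $(A,B)$: the signed rectangle areas are the terms $(a-a')(b-b')$. A brief numerical sketch, with arbitrarily chosen illustrative distributions, shows the two sides agreeing for $A=X$ and $B=\min(X,Y)$:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 10**6

# Independent X and Y (arbitrary illustrative distributions on [0, 1]).
x  = rng.beta(2.0, 2.0, n)
y  = rng.beta(3.0, 1.5, n)
m  = np.minimum(x, y)

# An independent copy (x', y') of the pair, giving min(x', y').
x2 = rng.beta(2.0, 2.0, n)
y2 = rng.beta(3.0, 1.5, n)
m2 = np.minimum(x2, y2)

# Signed "rectangle" terms: Cov(X, min(X,Y)) = (1/2) E[(X - X')(min(X,Y) - min(X',Y'))].
print(0.5 * np.mean((x - x2) * (m - m2)))
print(np.cov(x, m)[0, 1])   # the two estimates agree up to simulation error
```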
