Covariance definition

Question

Why is covariance defined the way it is? $$\sigma(X,Y)=\mathbb{E}[(X-\mathbb{E}[X])(Y-\mathbb{E}[Y])]$$ How do we know that this definition behaves in the following way?

Covariance is a measure of how much two random variables change together. If the greater values of one variable mainly correspond with the greater values of the other variable, and the same holds for the smaller values, i.e., the variables tend to show similar behavior, the covariance is positive. In the opposite case, when the greater values of one variable mainly correspond to the smaller values of the other, i.e., the variables tend to show opposite behavior, the covariance is negative. The sign of the covariance therefore shows the tendency in the linear relationship between the variables.

Is there any justification for correctness of this definition or history of its development? Do we just take this interpretation as an axiom? Obviously definitions cannot be wrong, but still they might somehow not agree with our intentions on how they are supposed to work.

There is a related discussion at http://stats.stackexchange.com/questions/101324. The covariance is not the only property of a bivariate distribution that has all these characteristics. The final conclusion in the quotation is a *non-sequitur* because the foregoing properties do not characterize only *linear* relationships. — whuber, Mar 19 '15 at 19:00
Some possibly relevant answers at http://stats.stackexchange.com/questions/18058/ — Juho Kokkala, Mar 19 '15 at 19:04

score 0 · Answer 1 · answered Mar 19 '15 at 18:25

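Hopefully my answer is not circular, but look carefully at the equation. You are multiplying the deviation of X from its mean by the deviation of Y from its mean, and then averaging the result. Each product is positive when both variables deviate in the same direction and negative when they deviate in opposite directions, which is what gives the average its sign. The covariance is, by definition, this average product of simultaneous deviations.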

Compare to the definition of variance to see why the term co-variance rightly applies: it's the same shape of definition with a minor substitution.
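To make the "average product of deviations" reading concrete, here is a minimal sketch in plain Python. The function name `covariance` and the sample data are made up for illustration, and it treats the data as an empirical distribution in which every pair is equally likely (so it computes the population covariance, not the sample version with n - 1).

```python
from statistics import fmean

def covariance(xs, ys):
    """Population covariance: mean of the products of paired deviations."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    mean_x, mean_y = fmean(xs), fmean(ys)
    return fmean((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

# Hypothetical data, chosen only for illustration.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys_same = [2.0, 4.0, 5.0, 4.0, 6.0]      # tends to rise with xs
ys_opposite = [6.0, 4.0, 5.0, 4.0, 2.0]  # tends to fall as xs rises

cov_same = covariance(xs, ys_same)          # positive
cov_opposite = covariance(xs, ys_opposite)  # negative
var_x = covariance(xs, xs)  # covariance of a variable with itself is its variance
```

Passing the same list twice recovers the variance, which is the "minor substitution" mentioned above.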

RegressForward


I'm curious why people use the word 'average' to describe expected value. Expected value is equal to arithmetic mean only when the probabilities of all random variable values are equal. – user216094 Mar 19 '15 at 18:34
You can take a look at LAD regression or quantile regression if you're interested in other measures of center. They have obstacles which have led to different developmental paths in the history of statistics, even though they remain viable. – RegressForward Mar 19 '15 at 18:39
In the [tickets-in-a-box model of random variables](http://stats.stackexchange.com/questions/50/what-is-meant-by-a-random-variable/54894#54894), the expectation *is* the arithmetic mean. A more rigorous way to say this is that the expectation of any empirical distribution is its arithmetic mean. You can appeal to limiting properties of empirical distributions (laws of large numbers) to extend this justification to all distributions. – whuber Mar 19 '15 at 19:03
I understand why it works, but I was asking about something else: how was this formula created? Did somebody just start thinking about which mathematical formula would satisfy the description of covariance given in my question and come up with $\sigma(X,Y)=\mathbb{E}[(X-\mathbb{E}[X])(Y-\mathbb{E}[Y])]$? – user216094 Mar 19 '15 at 20:58
That formula has been around since at least the late 17th century: It arises naturally in Newtonian mechanics. – whuber Aug 09 '21 at 21:41

score 0 · Answer 2 · answered Nov 06 '19 at 21:20

The variance formula is $Var(X,X) = E(X^2)-E(X)^2 =E(XX)-E(X)E(X)$

So the co-variance would be $Var(X,Y) = E(XY)-E(X)E(Y)$

$$\begin{aligned} E(XY) - E(X)E(Y) &= E(XY) - E(X)E(Y) - E(X)E(Y) + E(X)E(Y) \\ &= E(XY) - E[X\,E(Y)] - E[E(X)\,Y] + E[E(X)E(Y)] \\ &= E[XY - X\,E(Y) - E(X)\,Y + E(X)E(Y)] \\ &= E[(X-E(X))(Y-E(Y))] \end{aligned}$$
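As a rough numerical check of this identity, the sketch below evaluates both forms on a small discrete joint distribution. The distribution `joint` and the helper `expect` are hypothetical, invented only for this example; the probabilities are deliberately unequal so that the expectation is a probability-weighted sum rather than a plain arithmetic mean.

```python
# Hypothetical joint distribution P(X = x, Y = y); probabilities sum to 1.
joint = {
    (0, 1): 0.1,
    (1, 1): 0.2,
    (1, 3): 0.3,
    (2, 3): 0.4,
}

def expect(g):
    """E[g(X, Y)] as a probability-weighted sum over the joint distribution."""
    return sum(p * g(x, y) for (x, y), p in joint.items())

e_x = expect(lambda x, y: x)
e_y = expect(lambda x, y: y)

# E(XY) - E(X)E(Y)
shortcut = expect(lambda x, y: x * y) - e_x * e_y
# E[(X - E(X))(Y - E(Y))]
definition = expect(lambda x, y: (x - e_x) * (y - e_y))
```

Up to floating-point rounding, `shortcut` and `definition` agree, as the algebra says they should.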

Bill Chen


This does not appear to address the question, which is why covariance is defined in this way. – Nick Cox Dec 08 '19 at 11:37
