107

What is covariance in plain language and how is it linked to the terms dependence, correlation and variance-covariance structure with respect to repeated-measures designs?

Macro
abc
  • 21
    Also of interest: "[How would you explain covariance to someone who understands only the mean?](http://stats.stackexchange.com/q/18058)" and "[How would you explain the difference between correlation and covariance?](http://stats.stackexchange.com/q/18082)". – caracal Jun 03 '12 at 11:04

2 Answers

96

Covariance is a measure of how changes in one variable are associated with changes in a second variable. Specifically, covariance measures the degree to which two variables are linearly associated. However, it is also often used informally as a general measure of how monotonically related two variables are. There are many useful intuitive explanations of covariance here.

Regarding how covariance is related to each of the terms you mentioned:

(1) Correlation is a scaled version of covariance that takes on values in $[-1,1]$, with a correlation of $\pm 1$ indicating perfect linear association and $0$ indicating no linear relationship. This scaling makes correlation invariant to changes in the scale of the original variables (which Akavall points out and gives an example of, +1). The scaling constant, by which the covariance is divided, is the product of the standard deviations of the two variables.
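Written out, as a sketch in standard notation (the symbols $\mu_X, \mu_Y$ for the means and $\sigma_X, \sigma_Y$ for the standard deviations are my own labels, not taken from the question):

$$
\operatorname{cov}(X,Y) = E\big[(X-\mu_X)(Y-\mu_Y)\big], \qquad
\operatorname{corr}(X,Y) = \frac{\operatorname{cov}(X,Y)}{\sigma_X\,\sigma_Y}
$$

For positive constants $a$ and $b$, $\operatorname{cov}(aX,bY) = ab\,\operatorname{cov}(X,Y)$ while the standard deviations become $a\sigma_X$ and $b\sigma_Y$, so the factor $ab$ cancels and $\operatorname{corr}(aX,bY) = \operatorname{corr}(X,Y)$.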

(2) If two variables are independent, their covariance is $0$. But, having a covariance of $0$ does not imply the variables are independent. This figure (from Wikipedia)

[Figure from Wikipedia: scatter plots of dependent but uncorrelated data]

shows several example plots of data that are not independent, but whose covariances are $0$. One important special case is that if two variables are jointly normally distributed, then they are independent if and only if they are uncorrelated. Another special case is that pairs of Bernoulli variables are uncorrelated if and only if they are independent (thanks @cardinal).
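A small numerical illustration of point (2), as a sketch only: the data and the helper name `population_cov` are made up for this example. Here `y` is completely determined by `x`, yet the covariance is zero because the relationship is symmetric rather than linear.

```python
def population_cov(x, y):
    """Population covariance: mean of the products of deviations from the means."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n


# Hypothetical data: x is symmetric around 0 and y is an exact function of x.
x = [-2, -1, 0, 1, 2]
y = [v ** 2 for v in x]

print(population_cov(x, y))  # 0.0, even though y depends entirely on x
```

Positive and negative deviations of `x` pair with the same `y` values, so their products cancel.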

(3) The variance/covariance structure (often called simply the covariance structure) in repeated measures designs refers to the structure used to model the fact that repeated measurements on individuals are potentially correlated (and therefore dependent). This is done by modeling the entries in the covariance matrix of the repeated measurements. One example is the exchangeable correlation structure with constant variance, which specifies that each repeated measurement has the same variance and all pairs of measurements are equally correlated. A better choice may be to specify a covariance structure that requires two measurements taken farther apart in time to be less correlated (e.g. an autoregressive model). Note that the term covariance structure arises more generally in many kinds of multivariate analyses where observations are allowed to be correlated.
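To make the two structures in point (3) concrete, here is a minimal sketch that builds both covariance matrices for four time points. The function names, the variance of 2.0 and the correlation of 0.5 are illustrative assumptions, and the autoregressive case is taken to be first-order, AR(1), with correlation `rho ** lag`.

```python
def exchangeable_cov(n_times, variance, rho):
    """Exchangeable structure: constant variance, one common correlation rho."""
    return [[variance if i == j else variance * rho for j in range(n_times)]
            for i in range(n_times)]


def ar1_cov(n_times, variance, rho):
    """AR(1) structure: correlation rho ** |i - j| decays with time separation."""
    return [[variance * rho ** abs(i - j) for j in range(n_times)]
            for i in range(n_times)]


exch = exchangeable_cov(4, variance=2.0, rho=0.5)
ar1 = ar1_cov(4, variance=2.0, rho=0.5)

print("exchangeable:")
for row in exch:
    print(row)

print("AR(1):")
for row in ar1:
    print(row)
```

In the exchangeable matrix every off-diagonal entry is the same, whereas in the AR(1) matrix the entries shrink as the two measurement occasions get farther apart.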

Macro
  • 2
    Your explanation is nice. It is followed by a valuable supplement, which prompted an interesting series of comments. Thanks a lot to all :) ! – abc Jun 07 '12 at 07:23
26

Macro's answer is excellent, but I want to add to the point about how covariance is related to correlation. Covariance doesn't really tell you about the strength of the relationship between the two variables, while correlation does. For example:

x = [1, 2, 3]
y = [4, 6, 10]

cov(x,y) = 2 #I am using population covariance here

Now let's change the scale, and multiply both x and y by 10

x = [10, 20, 30]
y = [40, 60, 100]

cov(x, y) = 200

Changing the scale should not increase the strength of the relationship, so we can adjust by dividing the covariance by the product of the standard deviations of x and y, which is exactly the definition of the correlation coefficient.

In both of the cases above, the correlation coefficient between x and y is 0.98198.
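The numbers above can be checked with a short script. This is a sketch in plain Python; the helper names and the `ddof` parameter (0 for population covariance, 1 for sample covariance) are my own choices for illustration.

```python
from math import sqrt


def covariance(x, y, ddof=0):
    """Covariance of x and y; ddof=0 divides by n (population), ddof=1 by n - 1 (sample)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return cross / (n - ddof)


def correlation(x, y):
    """Covariance divided by the product of the two standard deviations."""
    return covariance(x, y) / sqrt(covariance(x, x) * covariance(y, y))


x = [1, 2, 3]
y = [4, 6, 10]
x10 = [10 * v for v in x]
y10 = [10 * v for v in y]

print(round(covariance(x, y), 5))          # 2.0   (population)
print(round(covariance(x, y, ddof=1), 5))  # 3.0   (sample)
print(round(covariance(x10, y10), 5))      # 200.0 (population, rescaled data)
print(round(correlation(x, y), 5))         # 0.98198
print(round(correlation(x10, y10), 5))     # 0.98198
```

Multiplying both variables by 10 multiplies the covariance by 100 but leaves the correlation unchanged.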

Akavall
  • 7
    "Covariance doesn't really tell you about the strength of the relationship between the two variables, while correlation does." That statement is completely false. The two measures are identical modulo scaling by the two standard deviations. – David Heffernan Jun 03 '12 at 15:35
  • 16
    @DavidHeffernan, yes, if scaled by the standard deviations, then covariance tells us about the strength of the relationship. However, the covariance measure by itself doesn't tell us that. – Akavall Jun 03 '12 at 16:48
  • 11
    @DavidHeffernan, I think what Akavall is saying is that _if you don't know the scale of the variables_ then covariance does not tell you anything about the strength of the relationship - only the sign can be interpreted. – Macro Jun 03 '12 at 16:51
  • 7
    In what practical situation can you obtain a covariance without also being able to obtain a good estimate of the scale of the variables? – David Heffernan Jun 03 '12 at 18:45
  • Just say it's a "scale-free measure". – Emre Jun 03 '12 at 20:06
  • 7
    However, it is not always necessary to know the standard deviation to understand the scale of a variable and thus the strength of a relationship. Unstandardised effects are often informative. E.g., if doing a training course causes people to increase their income by $10,000 per year on average, that's probably a better indication of the strength of the effect than saying that there was an r=.34 correlation between doing the course and income. – Jeromy Anglim Jun 04 '12 at 01:39
  • @DavidHeffernan I think the answer to your question is none. To get a covariance estimate you need the paired data. If you have the paired data, you have the individual components and can estimate their scale (e.g. standard deviation). Jeromy Anglim makes a very good point about interpreting in the original units (e.g. dollars in his case). But to make a judgement about a $10,000 change requires some idea of what constitutes a large difference. – Michael R. Chernick Aug 29 '12 at 16:26
  • +1, but isn't the `cov(x,y)=3`? – avocado Mar 02 '14 at 11:37
  • @loganecolss, sample covariance would indeed be 3, but I am looking at population covariance. – Akavall Mar 02 '14 at 14:00
  • What is `population covariance`? Are there any other types of covariance? – mrgloom Nov 06 '19 at 22:02
  • > In what practical situation can you obtain a covariance without also being able to obtain a good estimate of the scale of the variables? In my discrete math homework lol – John D Nov 19 '20 at 02:35