1

Correlation is computed from covariance, so how come covariance can pick up non-linear relationships between variables $X$ and $Y$ but (Pearson's) correlation can't?

develarist
  • 1
    I disagree. Pearson correlation can pick up on nonlinear relationships. – Dave Aug 20 '20 at 18:45
  • 1
    Are you confusing Pearson correlation with Spearman correlation? – develarist Aug 20 '20 at 18:52
  • 1
    Nope. Consider points hugging the right side of a parabola. Pearson correlation will be weaker than Spearman correlation, but Pearson will pick up on that relationship. – Dave Aug 20 '20 at 18:53
  • why do textbooks all say pearson correlation measures linear dependence only – develarist Aug 20 '20 at 18:54
  • 1
    I did not quite understand your question. However, if I did, then this https://stats.stackexchange.com/q/229667/3277 might be of interest to you. – ttnphns Aug 20 '20 at 19:23
  • even there, the non-linearity question wasn't answered but thanks – develarist Aug 20 '20 at 19:30

1 Answer

4

Covariance and correlation (which is simply scaled covariance) only pick up linear relationships, but this does not mean that a linear relationship exists only if one variable is a linear transformation of the other.
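To make "scaled covariance" concrete, the standard definition of Pearson's correlation can be written as follows. The symbols $\sigma_X$ and $\sigma_Y$ (the standard deviations of $X$ and $Y$, both assumed positive) are notation introduced here, not taken from the question:

$$\operatorname{cor}(X,Y) = \frac{\operatorname{cov}(X,Y)}{\sigma_X\,\sigma_Y}$$

Because the denominator is positive, correlation and covariance always have the same sign and are zero together. They differ only in magnitude: covariance changes with the units of $X$ and $Y$, while correlation is confined to $[-1, 1]$.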

Strictly speaking, a perfect linear relationship means that any given change in the independent variable $x$ always produces the same change in the dependent variable $y$: a one-unit increase in $x$ changes $y$ by $b$ units, wherever $x$ starts. That is, $y$ is a linear (more technically: affine) transformation of $x$, $y=a+bx$. Direct proportionality, where a 10 percent increase or decrease in $x$ results in a 10 percent increase or decrease in $y$, is the special case $a=0$.

This is a perfect linear relationship, for example:

> x <- 1:10
> y <- 3 + 2*x
> cor(x,y)
[1] 1

However, there is some linear relationship, or linear dependence, whenever an increase or decrease in one variable tends to go along with a corresponding increase or decrease in the other, even if $y$ is not a linear transformation of $x$, for example:

> x <- 1:10
> y <- 3 + 2*x^2
> cor(x,y)
[1] 0.9745586

Notice that correlation is less than one because the linear relationship is not perfect.
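One way to quantify "not perfect" is that the squared Pearson correlation equals the $R^2$ of the least-squares straight line fitted to the data. A minimal sketch, reusing the quadratic example; the names `fit` and `r2` are introduced here for illustration only:

```r
x <- 1:10
y <- 3 + 2 * x^2

fit <- lm(y ~ x)               # best straight-line approximation to y
r2  <- summary(fit)$r.squared  # share of the variance of y captured by that line

# The squared correlation and the R^2 of the linear fit coincide
all.equal(r2, cor(x, y)^2)
```

The straight line captures most, but not all, of the variation in $y$; the remainder is the curvature that a line cannot follow.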

There is still a linear relationship if $y$ tends to increase when $x$ increases but occasionally decreases, for example:

> x <- 1:100
> y <- x + tan(x)
> cor(x,y)
[1] 0.7940153

There is no linear relationship if $y$ is just as likely to increase as to decrease when $x$ increases (or decreases), for example:

> x <- -10:10   # x is increasing
> y <- x^2      # y is decreasing when x < 0, then increasing
> cor(x,y)
[1] 0
> cov(x,y)
[1] 0

As you can see, when there is no linear relationship, both correlation and covariance are zero.
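The zero in the last example can be checked by hand. As a sketch, write $\bar x$ and $\bar y$ for the sample means and $n$ for the number of points (notation introduced here). With $y_i = x_i^2$ and values of $x$ symmetric around zero, $\bar x = 0$, so

$$\operatorname{cov}(x,y) = \frac{1}{n-1}\sum_{i}(x_i-\bar x)(y_i-\bar y) = \frac{1}{n-1}\Big(\sum_{i} x_i^3 - \bar y\sum_{i} x_i\Big) = 0$$

Both sums vanish because each positive $x_i$ is cancelled by its negative counterpart. The dependence of $y$ on $x$ is exact, yet it has no linear component, so covariance and correlation both miss it.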

Sergio
  • 1
    `Covariance and correlation (which is simply scaled covariance) only pick up linear relationships` I don't agree with this, as I've argued in my answer linked in the comments above. Covariance, unlike correlation, is not a coefficient measuring, by the magnitude of its value, just the strength of linear relationship. – ttnphns Aug 20 '20 at 21:30
  • 1
    @ttnphns You are right (covariance is not _a measure_ of the magnitude of linear relationship) but Richard Hardy and Peter Flom (https://stats.stackexchange.com/questions/229667/difference-between-correlation-and-covariance-is-covariance-only-useful-if-the) are not wrong: both correlation and covariance are zero if there is no linear relationship and nonzero if there is a linear relationship, so both pick up linear relationships. – Sergio Aug 21 '20 at 05:56
  • wonder why the first guy who commented said covariance can detect non-linearity – develarist Aug 22 '20 at 21:23