6

I have two continuous variables that may have a nonlinear relationship. A scatter plot of the two variables showed an elliptical shape. Furthermore, I calculated both the Pearson correlation coefficient and Spearman's rank correlation coefficient; they were 0.624 and 0.619 respectively.

Does this indicate a linear relationship? How can I verify whether the relationship is linear or nonlinear? Is there a linearity test?

SAN
  • 1
    a related post http://stats.stackexchange.com/questions/218127/intuition-behind-pearson-correlation-co-variance-and-cosine-similarity – Haitao Du Sep 09 '16 at 14:38
  • A fairly simple test is to run a polynomial regression: $y = \alpha + \beta_1 x + \beta_2 x^2 + \beta_3 x^3 + \epsilon$ or something similar. Check whether the coefficients $\beta_2$ and $\beta_3$ are statistically indistinguishable from zero. – lmo Sep 09 '16 at 14:40
  • The correlation measures the degree of linear dependency; it does not by itself indicate that a linear relationship exists. Why not try both a linear and a nonlinear method and see which works better? – Repmat Sep 09 '16 at 14:42
  • Thanks for the reply. But, if there is a test then my finding will be confirmed. I'll try both linear and nonlinear methods. – SAN Sep 09 '16 at 14:47
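
The regression check suggested in the comments can be sketched roughly as follows. This is only an illustration, not lmo's exact procedure: instead of looking at the individual t-tests on $\beta_2$ and $\beta_3$, it uses a joint F-test comparing the straight-line fit with the cubic fit. The function name `nonlinearity_f_test` and its `degree` parameter are made up for this example, and the p-value relies on the usual assumption of independent, roughly normal errors with constant variance.

```python
import numpy as np
from scipy import stats


def nonlinearity_f_test(x, y, degree=3):
    """Joint F-test of H0: all polynomial terms above the linear one are zero.

    Hypothetical helper for illustration; assumes i.i.d., roughly normal errors.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size

    # Standardize x for numerical stability; this does not change the fits,
    # because the polynomial spaces spanned are the same.
    xs = (x - x.mean()) / x.std()

    # Design matrices with columns 1, x (restricted) and 1, x, ..., x^degree (full).
    X_lin = np.vander(xs, 2, increasing=True)
    X_poly = np.vander(xs, degree + 1, increasing=True)

    def rss(X):
        # Residual sum of squares of the least-squares fit of y on X.
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        resid = y - X @ beta
        return float(resid @ resid)

    rss_lin, rss_poly = rss(X_lin), rss(X_poly)

    q = degree - 1                # number of extra (nonlinear) terms tested
    df_resid = n - (degree + 1)   # residual degrees of freedom of the full model
    f_stat = ((rss_lin - rss_poly) / q) / (rss_poly / df_resid)
    p_value = stats.f.sf(f_stat, q, df_resid)
    return f_stat, p_value
```

A small p-value suggests that the higher-order terms improve on the straight line; a large one only says that this particular polynomial alternative found no evidence of curvature, not that the relationship is linear.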

1 Answer

8

Furthermore, both Pearson correlation coefficient and Spearman's rank correlation coefficient were calculated and they were 0.624 and 0.619 respectively. Does this indicate a linear relationship?

No, not necessarily. You can build datasets that have a correlation of about 0.6 but whose dependence is strongly nonlinear, or nearly linear/comonotonic but with anti-tail-dependence (extreme high values of one variable correspond to extreme low values of the other, and conversely).
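
As a small illustration of the first case (my own toy construction, not data from the question): take an exact parabola $y = x^2$ with $x$ spread evenly over $[-1, 1.5]$. There is no noise at all and the relationship is clearly not linear, yet the Pearson coefficient comes out at roughly 0.6. The interval endpoints are chosen only to land near that value.

```python
import numpy as np

# Evenly spaced x on an asymmetric interval (illustrative choice).
x = np.linspace(-1.0, 1.5, 10001)

# A purely deterministic, nonlinear (and non-monotonic) relationship.
y = x**2

# Pearson correlation between x and y.
r = np.corrcoef(x, y)[0, 1]
print(f"Pearson r = {r:.3f}")
```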

You can display an empirical copula: replace each value of X by its rank among the X values divided by the number of values, and do the same for Y. Both transformed variables are then uniformly spread over $[0,1]$. You can then plot a 'normalized' scatterplot, or an estimated density, of this bivariate distribution with uniform marginals. Perfect positive dependence (a comonotonic relationship) appears as the diagonal of $[0,1]^2$. For some Python code and illustrations of empirical copulas, you can have a look there.
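
A minimal sketch of that rank transform, assuming continuous data without ties (ties would be broken arbitrarily here). The function names `empirical_copula_sample` and `copula_density_grid` are invented for this example, and the histogram is just one crude way to estimate the density.

```python
import numpy as np


def empirical_copula_sample(x, y):
    """Map each observation to (rank of x / n, rank of y / n).

    Hypothetical helper; assumes continuous data, so ties are not handled.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    # argsort of argsort gives 0-based ranks; shift to 1..n and scale to (0, 1].
    u = (np.argsort(np.argsort(x)) + 1) / n
    v = (np.argsort(np.argsort(y)) + 1) / n
    return u, v


def copula_density_grid(u, v, bins=10):
    """Histogram estimate of the copula density on a bins x bins grid.

    Scaled so that independent variables give values close to 1 in every cell.
    """
    counts, _, _ = np.histogram2d(u, v, bins=bins, range=[[0, 1], [0, 1]])
    return counts / counts.sum() * bins**2
```

Scatter-plotting `u` against `v` gives the 'normalized' scatterplot: points hugging the diagonal indicate comonotonic dependence, points along the anti-diagonal indicate the reverse, and a uniform cloud indicates independence. Mass concentrated in particular corners of the grid points to tail dependence.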

rAntonioH
mic
  • Could you please explain it more? – SAN Sep 23 '16 at 10:39
  • 1
    You can have a look at https://arxiv.org/pdf/1610.09659.pdf; it explains how to read empirical copula measures to understand what kind of relationship there is between two variables. – mic Nov 01 '16 at 09:42
  • 1
    Is there a source that describes that binning procedure of sorting values, dividing them by their sample frequency, etc.? All of the copula textbooks are fixated on integrals/theory and not actual implementation, which is basically copula discretization. – develarist Aug 28 '20 at 18:56
  • 1
    yes, but it's old and hard to find (and in French): fonction de dépendance empirique (empirical copula) by Paul Deheuvels, 1979 – mic Aug 31 '20 at 02:11