I am trying to prove or disprove that the absolute difference between Spearman's correlation and Kendall's correlation is at most 1 (or some smaller constant; the tighter the bound, the better).

I am assuming there are no ties.

In an attempt to disprove the result with a counterexample, I checked all possibilities for vectors of length 8. I got some pretty pictures but no counterexample:

[Figure: the difference between the two correlations]

The difference is never more than 0.4 in this case, so I think the claim is true, but I have been unable to prove it.
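The exhaustive check described above can be sketched as follows. This is a minimal Python sketch, not the code used for the original plots: it assumes no ties, fixes the first vector at `1, ..., n`, and enumerates every permutation for the second. The function names are illustrative.

```python
from itertools import combinations, permutations


def spearman_rho(y):
    """Spearman's rho between the ranks 1..n and the permutation y (no ties)."""
    n = len(y)
    d2 = sum((i - yi) ** 2 for i, yi in enumerate(y, start=1))
    return 1 - 6 * d2 / (n * (n * n - 1))


def kendall_tau(y):
    """Kendall's tau between the ranks 1..n and the permutation y (no ties)."""
    n = len(y)
    # x is increasing, so positions i < j are concordant iff y[i] < y[j].
    s = sum(1 if a < b else -1 for a, b in combinations(y, 2))
    return s / (n * (n - 1) / 2)


def max_abs_difference(n):
    """Largest |rho - tau| over all permutations of 1..n (needs n >= 2)."""
    return max(
        abs(spearman_rho(p) - kendall_tau(p))
        for p in permutations(range(1, n + 1))
    )
```

For `n = 3` the six permutations can be checked by hand and the maximum is 1/6. Calling `max_abs_difference(8)` repeats the length-8 search; it visits 40320 permutations, so it takes a moment in pure Python.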

Pqqwetiqe
  • There is a very interesting post that could be a partial duplicate of your question: "Kendall Tau or Spearman's rho?", stats.stackexchange.com/questions/3943/kendall-tau-or-spearmans-rho . – Michael R. Chernick Jan 18 '17 at 22:57
  • For those who might want to tackle a direct algebraic approach, I believe the result can be obtained in two steps. First (the key step), show that the extreme absolute value of the difference is attained for the data $$(1,n),(2,n-1),\ldots, (n,1),(n+1,2n),(n+2,2n-1),\ldots,(2n,n+1)$$ for $2n$ points and $$(1,n+1),(2,n),\ldots,(n+1,1),(n+2,2n+1),(n+3,2n),\ldots,(2n+1,n+2)$$ for $2n+1$ points. Then just calculate the differences for these datasets. (In the first case there's another maximum and in the second case there are three other maxima implied by obvious symmetries.) – whuber Jan 19 '17 at 00:25
  • I came to say that I think the maximum absolute difference for odd $n$ is going to be $(n-2)/(2n)$. – Glen_b Jan 19 '17 at 00:46
  • @Glen_b If I am correct, then the maximum absolute difference for data of length $n$ is $$\frac{2 (n-2) \left\lceil \frac{n}{2}\right\rceil \left\lfloor \frac{n}{2}\right\rfloor }{n \left(n^2-1\right)},$$ which has a limiting value of $1/2$ (from below) as $n\to\infty.$ That supports what you wrote. This formula is related to [A111384](http://oeis.org/A111384), whose values are divided by $n(n^2-1)/4$. – whuber Jan 19 '17 at 01:03
  • That bound seems to match your formula for even $n$ (and your limiting cases in the earlier comment certainly seem to match those obtained by exhaustive calculation for all the small $n$ values I could readily check, but I expect you already did that). It's interesting that the bound is $1/2$. Did I make a mistake in the odd $n$ case? (edit: No, I see now, I was right apart from manipulating your formula) – Glen_b Jan 19 '17 at 01:12
  • @Glen_b A *limit* of $1/2$ is intuitive: for the patterns I described, Spearman is close to $1/2$ while Kendall is $O(1/n)$. The algebra is simplified by generalizing my "crayon" approach to covariance. Here's `R` code implementing the relevant formulas. The arguments consist of two permutations of `1:n`. **Spearman**: `function(x, y) mean(outer(x, x, '-') * outer(y, y, '-')) * 6 / (length(x)^2 - 1)` **Kendall**: `function(x, y) mean(sign(outer(x, x, '-')) * sign(outer(y, y, '-'))) * (1 + 1/(length(x)-1))` – whuber Jan 19 '17 at 01:18
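The comments above can be checked numerically. The sketch below is a Python port of the two `R` one-liners, applied to the conjectured extremal pattern (two descending blocks, the second shifted above the first) and compared with the closed-form expression from the comments. The helper names `extremal_y` and `closed_form` are illustrative and not from the thread. The check only confirms the value attained by this pattern; it does not prove that the pattern is the maximizer.

```python
def sign(v):
    """Return -1, 0 or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def spearman(x, y):
    """Port of the R one-liner: mean of pairwise difference products, rescaled."""
    n = len(x)
    pairs = list(zip(x, y))
    m = sum((xi - xj) * (yi - yj) for xi, yi in pairs for xj, yj in pairs) / n**2
    return m * 6 / (n**2 - 1)


def kendall(x, y):
    """Port of the R one-liner: mean of pairwise sign products, rescaled."""
    n = len(x)
    pairs = list(zip(x, y))
    m = sum(sign(xi - xj) * sign(yi - yj) for xi, yi in pairs for xj, yj in pairs) / n**2
    return m * (1 + 1 / (n - 1))


def extremal_y(n):
    """y-values of the pattern in the comments, paired with x = 1..n."""
    k = (n + 1) // 2  # size of the first descending block
    return list(range(k, 0, -1)) + list(range(n, k, -1))


def closed_form(n):
    """2 (n-2) ceil(n/2) floor(n/2) / (n (n^2 - 1)), as given in the comments."""
    return 2 * (n - 2) * ((n + 1) // 2) * (n // 2) / (n * (n**2 - 1))


for n in range(3, 10):
    x = list(range(1, n + 1))
    y = extremal_y(n)
    diff = abs(spearman(x, y) - kendall(x, y))
    print(n, y, round(diff, 6), round(closed_form(n), 6))
```

Since $\lceil n/2\rceil \lfloor n/2\rfloor$ equals $(n^2-1)/4$ for odd $n$ and $n^2/4$ for even $n$, the closed form reduces to $(n-2)/(2n)$ for odd $n$ and to $n(n-2)/\bigl(2(n^2-1)\bigr)$ for even $n$. Both tend to $1/2$ from below.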

1 Answer

You may want to look at this paper, and at other works by these authors. I can't remember exactly where, but I have seen your first graph in their papers, along with some proofs. I think this can be done by leveraging copulas, since Kendall's tau and Spearman's rho can both be written as functions of the underlying copula between the two variables. Hope it helps.

Let $C$ be the copula of $(X,Y)$ and $c$ its density.

$\tau(X,Y) = 4\int_0^1 \int_0^1 C(u,v) c(u,v) du dv - 1$

(Kendall's tau is the expectation of the copula, $E[C(U,V)]$, rescaled into $[-1,1]$.)

$\rho(X,Y) = 12 \int_0^1 \int_0^1 C(u,v) du dv - 3$

Then, $|\tau - \rho| \leq \ldots$
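To make the two copula formulas concrete, here is a minimal numerical sketch. It evaluates both integrals with a midpoint rule for the Farlie-Gumbel-Morgenstern (FGM) copula, which is my choice of example and does not come from the paper mentioned above. For this family the exact values are $\tau = 2\theta/9$ and $\rho = \theta/3$. The function names and the grid size `m` are illustrative.

```python
def fgm_cdf(u, v, theta):
    """FGM copula C(u, v) = u v (1 + theta (1 - u)(1 - v)), with -1 <= theta <= 1."""
    return u * v * (1 + theta * (1 - u) * (1 - v))


def fgm_density(u, v, theta):
    """Density c(u, v), the mixed partial derivative of the FGM copula."""
    return 1 + theta * (1 - 2 * u) * (1 - 2 * v)


def tau_rho_from_copula(C, c, m=200):
    """Approximate tau = 4 * int C c - 1 and rho = 12 * int C - 3 on an m x m midpoint grid."""
    h = 1.0 / m
    grid = [(i + 0.5) * h for i in range(m)]
    int_Cc = sum(C(u, v) * c(u, v) for u in grid for v in grid) * h * h
    int_C = sum(C(u, v) for u in grid for v in grid) * h * h
    return 4 * int_Cc - 1, 12 * int_C - 3


theta = 1.0
tau, rho = tau_rho_from_copula(
    lambda u, v: fgm_cdf(u, v, theta),
    lambda u, v: fgm_density(u, v, theta),
)
print(tau, rho, abs(tau - rho))  # compare with 2*theta/9 and theta/3
```

The FGM family only covers weak dependence ($|\rho| \le 1/3$), so it illustrates the formulas rather than the extreme cases discussed in the comments.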

mic
  • The paper is a nice reference for the techniques it exhibits. It does not, however, appear to contain a result that would easily imply the one conjectured in this question. That is primarily because its results are not universal: they apply under various restrictive conditions and even then only in the limit as the joint distribution approaches independence. – whuber Jan 18 '17 at 23:55