I have tried to collect a few remarks on distance covariance based on my impressions from reading the references listed below. However, I do not consider myself an
expert on this topic. Comments, corrections, suggestions, etc. are welcome.
The remarks are (strongly) biased towards
potential drawbacks, as requested in the original question.
As I see it, the potential drawbacks are as follows:
- The methodology is new. My guess is that this is the single
biggest factor behind its current lack of popularity. The
papers outlining distance covariance start in the mid-2000s and
progress up to the present day. The paper cited above is the one that
received the most attention (hype?) and it is less than three years
old. In contrast, the theory and results on correlation and
correlation-like measures have over a century of work already
behind them.
- The basic concepts are more challenging. Pearson's
product-moment correlation, at an operational level, can be
explained to college freshmen without a calculus background
pretty readily. A simple "algorithmic" viewpoint can be laid
out and the geometric intuition is easy to describe. In contrast, for distance covariance, even the notion of sums of products of pairwise Euclidean
distances is quite a bit more difficult to convey, and the notion of
covariance with respect to a stochastic process goes far beyond
what could reasonably be explained to such an audience.
- It is computationally more demanding. The basic algorithm for
computing the test statistic is $O(n^2)$ in the sample size as
opposed to $O(n)$ for standard correlation metrics. For small
sample sizes this is not a big deal, but for larger ones it
becomes more important (the first sketch after this list shows the $O(n^2)$ computation).
- The test statistic is not distribution free, even
asymptotically. One might hope that, for a test statistic that is
consistent against all alternatives, the distribution, at least
asymptotically, would be independent of the underlying
distributions of $X$ and $Y$ under the null hypothesis. This is
not the case for distance covariance, as the null distribution
depends on the underlying distributions of $X$ and $Y$ even as
the sample size tends to infinity. It is true that the
distributions are uniformly bounded by a $\chi^2_1$ distribution,
which allows for the calculation of a conservative critical
value. (In practice, calibration by permutation is also common;
see the second sketch after this list.)
- The distance correlation is a one-to-one transform of $|\rho|$ in
the bivariate normal case. This is not really a drawback, and
might even be viewed as a strength. But, if one accepts a
bivariate normal approximation to the data, which is quite
common in practice, then little, if anything, is gained from
using distance correlation in place of standard procedures (the
third sketch after this list illustrates this).
- Unknown power properties. Being consistent against all
alternatives essentially guarantees that distance covariance must
have very low power against some alternatives. In many cases, one
is willing to give up generality in order to gain additional
power against particular alternatives of interest. The original
papers show some examples in which they claim high power relative
to standard correlation metrics, but I believe that, going back
to the first point above, its behavior against alternatives is
not yet well understood.
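
For concreteness, here is a minimal sketch (my own, not the authors' reference implementation) of the $O(n^2)$ sample distance covariance and distance correlation for univariate samples, following the double-centering recipe in the 2007 paper; the function names and the numpy-based code are my own choices.

```python
import numpy as np

def dcov_sq(x, y):
    """Sample (squared) distance covariance V_n^2 for univariate samples."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Pairwise distance matrices: O(n^2) in both time and memory.
    a = np.abs(x[:, None] - x[None, :])
    b = np.abs(y[:, None] - y[None, :])
    # Double-center each matrix: subtract row and column means, add the grand mean.
    A = a - a.mean(axis=0) - a.mean(axis=1)[:, None] + a.mean()
    B = b - b.mean(axis=0) - b.mean(axis=1)[:, None] + b.mean()
    # V_n^2 is the average of the elementwise products (clipped at 0 for numerics).
    return max((A * B).mean(), 0.0)

def dcor(x, y):
    """Sample distance correlation R_n."""
    denom = np.sqrt(dcov_sq(x, x) * dcov_sq(y, y))
    return np.sqrt(dcov_sq(x, y) / denom) if denom > 0 else 0.0
```

The two $n \times n$ distance matrices are what make the quadratic time and memory cost unavoidable in this basic form, in contrast to the single pass over the data needed for Pearson's $r$.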
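Because the null distribution is not distribution free, one common practical alternative to the conservative $\chi^2_1$ bound is to calibrate the test by permutation. A sketch, reusing dcov_sq from above (the function name and default settings are mine):

```python
def dcov_perm_test(x, y, n_perm=999, seed=0):
    """Permutation p-value for the test statistic n * V_n^2 (uses dcov_sq above)."""
    rng = np.random.default_rng(seed)
    n = len(x)
    stat = n * dcov_sq(x, y)
    # Shuffling y breaks the pairing with x, which simulates the null of independence.
    exceed = sum(n * dcov_sq(x, rng.permutation(y)) >= stat
                 for _ in range(n_perm))
    return (exceed + 1) / (n_perm + 1)
```

If I recall correctly, the authors' energy package for R takes a similar resampling approach.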
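Finally, a quick simulation (again mine, not from the papers) illustrating the bivariate normal point: as $|\rho|$ grows, the sample distance correlation moves monotonically with it, so in that setting it carries essentially the same information as $|\rho|$. It reuses dcor from the first sketch.

```python
rng = np.random.default_rng(1)
n = 1000
for rho in (0.0, 0.3, 0.6, 0.9):
    # Draw a bivariate normal sample with correlation rho.
    x, y = rng.multivariate_normal([0.0, 0.0],
                                   [[1.0, rho], [rho, 1.0]], size=n).T
    print(f"rho = {rho:.1f}  |pearson| = {abs(np.corrcoef(x, y)[0, 1]):.3f}  "
          f"dcor = {dcor(x, y):.3f}")
```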
To reiterate, this answer probably comes across as quite negative. But
that is not the intent. There are some very beautiful and interesting
ideas related to distance covariance, and its relative novelty also
opens up research avenues for understanding it more fully.
References:
- G. J. Szekely and M. L. Rizzo (2009), Brownian distance
covariance, Ann. Appl. Statist., vol. 3, no. 4, 1236–1265.
- G. J. Szekely, M. L. Rizzo and N. K. Bakirov (2007), Measuring and
testing independence by correlation of distances, Ann. Statist.,
vol. 35, 2769–2794.
- R. Lyons (2012), Distance covariance in metric spaces,
Ann. Probab. (to appear).