
The data I am using is continuous, and may be somewhat skewed, but probably will not have many outliers.

The following rule of thumb has been suggested to me: examine both the Pearson and Spearman correlations and, if they differ, report the latter, on the grounds that a difference between the two might indicate that the assumptions of significance testing on Pearson's r are not met.
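
A minimal sketch of this rule in R, assuming simulated right-skewed data and an arbitrary cut-off for what counts as "different" (the rule as stated gives none). The data-generating process, the variable names and the tolerance `tol` are all illustrative assumptions:

```r
set.seed(1)  # arbitrary seed so the sketch is reproducible

# Hypothetical data: continuous, right-skewed, linearly related on the log scale
n <- 200
x <- rlnorm(n)
y <- x * rlnorm(n, sdlog = 0.5)

r_pearson  <- cor(x, y, method = "pearson")
r_spearman <- cor(x, y, method = "spearman")

# Assumed tolerance: the rule of thumb does not say how large a difference matters
tol <- 0.1
reported <- if (abs(r_pearson - r_spearman) > tol) "spearman" else "pearson"

c(pearson = r_pearson, spearman = r_spearman)
reported
```

Whatever value of `tol` is chosen, the decision depends on it, which is one reason the rule is hard to apply mechanically.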

  • Think what that rule of thumb may boil down to: never compute Pearson at all, just proceed at once to Spearman. – ttnphns Jul 23 '14 at 10:27
  • I'm unable to see why that follows; I've frequently seen them come out to be very similar, in which case (according to this rule of thumb) we would end up using the Pearson correlation. – user1205901 - Reinstate Monica Jul 31 '14 at 02:43
  • But if they're very similar, why would you bother changing from Spearman - the only times you'd consider using it you'll get essentially the same answer using Spearman, so there'd be no obvious loss by simply always using Spearman. What information would you gain by switching? (This isn't intended as an argument for Pearson, or for Spearman, nor even against choosing between them in some way; it's important to understand your reasoning). – Glen_b Jul 21 '15 at 04:55
  • I am not sure what the rationale for the rule of thumb was meant to be, and am no longer able to communicate with the person who suggested it to me. I can speculate that it might stem from a desire to report the more familiar (to the audience in question) Pearson correlation unless there's a good reason to use the Spearman correlation instead. – user1205901 - Reinstate Monica Jul 21 '15 at 05:08
  • Strongly related: http://stats.stackexchange.com/questions/8071/how-to-choose-between-pearson-and-spearman-correlation – Tim Jul 21 '15 at 08:00

1 Answer

Pearson and Spearman correlation coefficients do not have the same meaning. Pearson measures the strength of the linear dependence between the two variables, while Spearman measures the strength of the monotonic relationship between them, whether it is linear or not.

For example in R,

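x = rnorm(1000) # simulate 1000 draws from a standard Gaussian variable
y = exp(100 * x) # increasing but highly nonlinear transform of x
cor(x, y, method = "spearman") # Spearman: 1, because the ranks of x and y agree
# [1] 1
cor(x, y, method = "pearson") # Pearson: small, because the relationship is far from linear
# [1] 0.101593 (this value varies from run to run, since no seed is set)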

You should report the correlation coefficient that makes more sense for your needs.

ThePawn
  • However, if you're prepared to assume a linear relationship, then Spearman generally performs quite well as a way to estimate its strength, even in the presence of some outlier process that can render Pearson uninformative. – Glen_b Jul 21 '15 at 12:06
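
The point about outliers can be illustrated with a small simulation. This is only a sketch under assumed settings: the sample size, the noise level and the position of the single contaminated point are arbitrary choices, and the outlier is deliberately placed against the trend:

```r
set.seed(42)  # arbitrary seed so the sketch is reproducible

n <- 100
x <- rnorm(n)
y <- 2 * x + rnorm(n, sd = 0.5)  # a genuinely linear relationship plus noise

# Hypothetical contamination: replace one observation with a gross outlier
x_out <- x
y_out <- y
x_out[1] <- 50
y_out[1] <- -50

clean <- c(pearson  = cor(x, y, method = "pearson"),
           spearman = cor(x, y, method = "spearman"))
contaminated <- c(pearson  = cor(x_out, y_out, method = "pearson"),
                  spearman = cor(x_out, y_out, method = "spearman"))

rbind(clean, contaminated)
```

On the clean data the two coefficients are close. With the single outlier, Pearson is dominated by that one point and even changes sign here, while Spearman, which only sees ranks, moves much less.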