3

Which correlation coefficient is most appropriate for comparing two time series? I want to compare the variation of one variable across two regions, and I have regional data for the last 30 years. Is Pearson correlation OK, or should I rely on Kendall's tau-b or Spearman's rho, and why? I tried to google it and analyse what I found, but I'm still not sure.

kjetil b halvorsen
user36799
  • 2
    I'm no expert in time series, but browsing our [related threads](http://stats.stackexchange.com/questions/tagged/correlation+time-series) might bring useful information. – chl Jan 02 '14 at 21:33
  • 2
    looks very similar to this question: http://stats.stackexchange.com/questions/80577/correlating-time-series-for-20-regions-spss/80605#80605 – forecaster Jan 02 '14 at 21:44
  • 2
    and [this question](http://stats.stackexchange.com/q/23993/32036), and [this](http://stats.stackexchange.com/q/19103/32036), and [this](http://stats.stackexchange.com/q/3943/32036), and [this](http://stats.stackexchange.com/q/18112/32036)... – Nick Stauner Jan 02 '14 at 21:58
  • i've browsed related threads. there's no answer to my question among the threads you guys linked. i mean correlating time series - which probably makes a difference. i found the information that for example pearson correlation does not make so much sense with time series. no idea why. – user36799 Jan 02 '14 at 23:20
  • 2
    the 3rd & 4th threads I linked explain some differences among the 3 correlation estimators that you mentioned specifically, but they mainly differ in what kinds of data they best address in other terms (continuous vs. ordinal, normally distributed vs. nonparametric). they don't take into account temporal contiguity across your individual time series of observations, so a method that does take it into account might be preferable, depending on what exactly you want to estimate. you should probably be more specific about that if you can; as written, it's impossible to see how your question differs from the others. – Nick Stauner Jan 02 '14 at 23:38
  • 3
    If, after considering some more sophisticated methods of comparing time series, you agree that Pearson's $r$ isn't appropriate to your analysis, Kendall's $\tau$ and Spearman's $\rho$ aren't likely to be much better. You may need to reformulate your question in that case, because you wouldn't just be talking about a simple correlation, I suspect. – Nick Stauner Jan 02 '14 at 23:41
  • You mean that when I use Pearson correlation and other listed it doesn't treat the time series as a certain order but mixes the data within the time series together (from the smallest to the biggest number)? Do I understand you correctly? I thought it interpreted series as ... series, specific sequences. – user36799 Jan 02 '14 at 23:53
  • 2
    Pearson's $r$ ignores the order of repeated observations of any one variable; it only preserves pairings between observations of separate variables. In your two time series, you have observations of two separate variables from each year, right? Pearson's $r$ would retain the information that those two variables are observed in the same year, but it would ignore which year. That is, it wouldn't preserve the chronological order information of your within-year pairs of the two variables; shuffling your within-year pairs wouldn't make a difference in the $r$ you'd calculate (nor $\tau$ or $\rho$). – Nick Stauner Jan 03 '14 at 00:15
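The point in the last comment can be checked numerically. The following is a minimal sketch, assuming synthetic yearly data for two regions (the series, the seed, and the helper name `pearson_r` are all hypothetical): it computes Pearson's $r$, shuffles the years while keeping each year's pair together, and recomputes it.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data: 30 yearly values for two regions that share a trend.
rng = random.Random(0)
region_a = [0.5 * t + rng.gauss(0, 2) for t in range(30)]
region_b = [0.4 * t + rng.gauss(0, 2) for t in range(30)]

r_original = pearson_r(region_a, region_b)

# Shuffle the years, but keep each year's (a, b) pair intact.
pairs = list(zip(region_a, region_b))
rng.shuffle(pairs)
shuffled_a, shuffled_b = zip(*pairs)
r_shuffled = pearson_r(shuffled_a, shuffled_b)

# The two values agree up to floating-point rounding.
print(r_original, r_shuffled)
```

Because only the pairing matters, the chronological order has no effect on the coefficient; the same holds for Kendall's $\tau$ and Spearman's $\rho$.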

2 Answers

0

For time series, some version of the Pearson correlation is most commonly used, in the form of the autocorrelation function (one series correlated with itself at various lags) and, likewise, the cross-correlation function (two series correlated with each other at various lags). They are appropriate when all conditional expectations are linear.

If you suspect that may not be the case, you should start with some visualization of the two series! I have not seen any detailed descriptive analysis of two time series; that would be rather interesting... In R you could play with the function coplot, and you could make scatterplot matrices, replacing what would be a single number at each lag in the two functions above (autocorrelation, cross-correlation) with a scatterplot. You could also look into copulas used with time series.
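To make the autocorrelation and cross-correlation functions mentioned above concrete, here is a minimal sketch in plain Python. The function name `cross_correlation`, the synthetic series, and the choice of estimator (full-series means and standard deviations, sums divided by $n$) are assumptions made for illustration, not something prescribed in this thread.

```python
import math


def cross_correlation(x, y, lag):
    """Sample cross-correlation between x[t] and y[t + lag].

    Assumes equal-length series and uses full-series means and
    standard deviations, with sums divided by n at every lag.
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sx = math.sqrt(sum((a - mx) ** 2 for a in x) / n)
    sy = math.sqrt(sum((b - my) ** 2 for b in y) / n)
    if lag >= 0:
        pairs = zip(x[: n - lag], y[lag:])
    else:
        pairs = zip(x[-lag:], y[: n + lag])
    cov = sum((a - mx) * (b - my) for a, b in pairs) / n
    return cov / (sx * sy)


# Hypothetical series: y trails x by two time steps by construction.
base = [math.sin(0.7 * t) for t in range(32)]
x = base[2:]   # x[t] = base[t + 2]
y = base[:30]  # y[t] = base[t], so y[t + 2] = x[t]

# Cross-correlation at lags -5..5; the autocorrelation of x is the
# same function applied to (x, x).
ccf = {lag: cross_correlation(x, y, lag) for lag in range(-5, 6)}
acf_x = {lag: cross_correlation(x, x, lag) for lag in range(0, 6)}

best_lag = max(ccf, key=ccf.get)
print(best_lag)
```

Each value in `ccf` is one Pearson-type number per lag; the scatterplot idea above amounts to plotting the lagged pairs instead of summarizing them in that single number.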

kjetil b halvorsen
0

What problem are you trying to solve? A correlation between the variance of two regions doesn't make sense if you exclude the temporal dimension. At each time step the variance has a probability distribution and thus you have an infinite number of distributions for which you are observing a finite number of samples. You need to compare these stochastic processes and assess their differences.

Chris