I have a question about correlations that I am trying to uncover using R. I have two data sets that span the years 2010-2016 and include all counties within a state. I am trying to correlate these data sets with each other by year (looking at inter-county correlations) and then by county (looking at year-based correlations). The variables are all continuous.

The relationships between Number of Visits and each of Variable 1, ..., Variable 19 are all non-linear over the total span of years, 2006-2016 (mostly parabolic and S-shaped curves).

I believe that Pearson's correlation and linear regression both assume a linear relationship between the variables, which would make them invalid choices here.

This is a rather new topic to me, and I was wondering if anyone had any ideas or thoughts on what might be the best route to take to find correlations for my non-linear data.

Thank you in advance!

[I originally posted this on Stack Overflow, and it was recommended that I move the question to this site instead.]

  • I wouldn't call it an assumption. Nonlinear relationships do not make the use of Pearson's correlation invalid. It is just that it would be inappropriate to draw conclusions based on Pearson's correlation without being mindful of possible nonlinearities in the relationship. – Moss Murderer Feb 08 '18 at 20:38
  • Related question here: https://stats.stackexchange.com/questions/8956/spearmans-or-pearsons-correlation-with-likert-scales-where-linearity-and-homos/8958#8958 – AdamO Feb 08 '18 at 21:07

1 Answer

Try rank-order correlation coefficients such as Kendall's $\tau$ or Spearman's $\rho$. They do not assume a linear relationship; they work on the rank ordering of the observations in each series, so they pick up any monotonic association. They are also more robust to outliers.
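As a rough illustration of the difference, here is a minimal sketch in plain Python (standard library only). The data are made up: a logistic S-curve standing in for one of the monotonic non-linear relationships described in the question. The helper names `ranks`, `pearson`, `spearman` and `kendall_tau` are my own, and the Kendall version is the simple tau-a, which assumes no ties.

```python
import math

def ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result

def pearson(x, y):
    """Pearson product-moment correlation."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(ranks(x), ranks(y))

def kendall_tau(x, y):
    """Kendall's tau-a: (concordant - discordant) / number of pairs (no ties assumed)."""
    n = len(x)
    s = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            s += (prod > 0) - (prod < 0)
    return s / (n * (n - 1) / 2)

# Made-up example: a steep, strictly increasing S-curve
x = [i / 2 for i in range(-10, 11)]
y = [1 / (1 + math.exp(-2 * v)) for v in x]

r_pearson = pearson(x, y)
r_spearman = spearman(x, y)
r_kendall = kendall_tau(x, y)
print(f"Pearson:  {r_pearson:.3f}")
print(f"Spearman: {r_spearman:.3f}")
print(f"Kendall:  {r_kendall:.3f}")
```

Because the curve is strictly increasing, both rank coefficients come out at 1, while Pearson's coefficient is noticeably lower. In R you would not need to write any of this by hand: the built-in `cor` accepts `method = "spearman"` or `method = "kendall"`.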

Aksakal
  • Rank statistics can summarize *monotonic* trends. While the OP remarks that some of the nonlinear trends he/she inspected tend to follow shapes which are monotonic, the rank statistics in general are not powered to detect nonlinear associations. – AdamO Feb 08 '18 at 21:10
  • @AdamO OP didn't mention anything about non-monotonic. Rank statistics work great for monotonic non-linear relationships, e.g. see Wikipedia: https://en.wikipedia.org/wiki/Spearman%27s_rank_correlation_coefficient – Aksakal Feb 08 '18 at 21:15
  • Nonlinear includes monotonic and nonmonotonic. GAMs, LOESS, or splines would be even more general. – AdamO Feb 08 '18 at 21:17
  • @AdamO, OP is not asking about modeling, seems to be looking for a correlation measure at the moment – Aksakal Feb 08 '18 at 23:49
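
To make the monotonic vs. non-monotonic point from these comments concrete, here is a small sketch in plain Python (standard library only) on made-up data. It uses a symmetric parabola, since the question mentions parabolic shapes. The helper names are my own, and splitting the data at the vertex is only a way to show what happens on each monotonic branch, not a recommended analysis.

```python
import math

def ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result

def pearson(x, y):
    """Pearson product-moment correlation."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(ranks(x), ranks(y))

# Made-up example: y is completely determined by x, but not monotonically
x = list(range(-10, 11))
y = [v * v for v in x]

rho_all = spearman(x, y)

# Each branch of the parabola is monotonic on its own
left = [(a, b) for a, b in zip(x, y) if a <= 0]
right = [(a, b) for a, b in zip(x, y) if a >= 0]
rho_left = spearman([a for a, _ in left], [b for _, b in left])
rho_right = spearman([a for a, _ in right], [b for _, b in right])

print(f"Spearman, full range:   {rho_all:.3f}")
print(f"Spearman, left branch:  {rho_left:.3f}")
print(f"Spearman, right branch: {rho_right:.3f}")
```

Over the full range the rank correlation is zero even though y depends perfectly on x, while each monotonic branch on its own gives -1 and +1. So a rank coefficient is a reasonable summary for the S-curves, but for the parabolic relationships it can miss the association entirely.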