I am examining how well two non-stationary time series, which are supposed to represent the same thing, match each other. I therefore believe I should be looking specifically for a linear relationship in the plot of one against the other. However, my data does not fulfill the normality assumption of Pearson's r. I know that the non-parametric equivalent is Spearman's rho; however, I am not sure whether measuring correlation for monotonic relationships in general is useful in this case. I was therefore considering using RMSE as a measure, treating the more authoritative time series as the actual values and the other as a predictive model of the first for the purpose of calculating residuals. I intend to use this to calculate a lag time, replacing the cross-correlation calculation. Are there any statistical problems with this, or are there any better methods? Also, since this replaces cross-correlation, how does the problem of intra/auto-correlation that affects cross-correlation affect this RMSE method?
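To make the proposal concrete, here is a minimal sketch of the RMSE-based lag search described above. It assumes both series are sampled on the same regular grid. The function names (`rmse`, `rmse_by_lag`, `best_lag`), the sign convention for the lag, and the synthetic data are all hypothetical choices made for illustration, not part of any library.

```python
import math


def rmse(actual, predicted):
    """Root-mean-square error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


def rmse_by_lag(reference, candidate, max_lag):
    """Return {lag: RMSE} for every integer lag in [-max_lag, max_lag].

    Assumed convention: a positive lag means `candidate` trails `reference`,
    so candidate[t + lag] is compared with reference[t].
    """
    n = min(len(reference), len(candidate))
    scores = {}
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            ref, cand = reference[:n - lag], candidate[lag:n]
        else:
            ref, cand = reference[-lag:n], candidate[:n + lag]
        scores[lag] = rmse(ref, cand)
    return scores


def best_lag(reference, candidate, max_lag):
    """Lag with the smallest RMSE over the overlapping samples."""
    scores = rmse_by_lag(reference, candidate, max_lag)
    return min(scores, key=scores.get)


# Hypothetical data: a trending (non-stationary) series and a copy of it
# delayed by 3 samples, padded with zeros at the start.
reference = [0.5 * t + math.sin(t / 3.0) for t in range(60)]
candidate = [0.0] * 3 + reference[:-3]

print(best_lag(reference, candidate, max_lag=10))
```

Two properties of this sketch matter for the question: each lag is scored on a different number of overlapping samples, and RMSE, unlike a correlation coefficient, is sensitive to any constant offset or scale difference between the two series.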

H Huang
  • There is no normality assumption behind Pearson's $r$. – Richard Hardy Jan 13 '21 at 09:05
  • My understanding was that as a parametric test, it could only be applied to data that comes from a population with a normal distribution. – H Huang Jan 13 '21 at 19:34
  • Not quite. First, Pearson's $r$ is not a test; there is no hypothesis and no decision being taken. Second, parametric tests do not always assume a normal distribution; they may assume all kinds of distributions. – Richard Hardy Jan 13 '21 at 19:38
  • Looking online for the assumptions of Pearson's $r$, I've found a lot of articles like [this](https://journals.lww.com/anesthesia-analgesia/Fulltext/2018/05000/Correlation_Coefficients__Appropriate_Use_and.50.aspx), which mention normality as one of the assumptions. They aren't all worded the same, but they mostly do mention normality in some way. Are they wrong, or have I misunderstood what they mean? – H Huang Jan 13 '21 at 19:52
  • The statement *The Pearson correlation coefficient is typically used for jointly normally distributed data* from the linked document is simply not generally true. Maybe in the particular field of application it is, but the statistical theory does not require that. It reminds me of a common misconception that normality is an important regression assumption (its importance is limited to cases where small sample distributional characteristics of regression coefficients matter). – Richard Hardy Jan 13 '21 at 20:16
  • I have recently read Rodgers & Nicewander ["Thirteen ways to look at the correlation coefficient"](https://www.jstor.org/stable/pdf/2685263.pdf) (1988). Perhaps it could be useful, though I am not sure (they do not discuss the probabilistic properties, as they note at the bottom of the first column on p. 61). – Richard Hardy Jan 13 '21 at 20:22
  • The article was quite illuminating, thank you for sharing it! I noticed that when they said that normality was not a necessary assumption, they cited [The needless assumption of normality in Pearson's r.](https://www.semanticscholar.org/paper/The-needless-assumption-of-normality-in-Pearson's-Nefzger-Drasgow/7234bd7b4ed336b9171af7dcaba7cf81ca8bc781), which indicates in the first couple of paragraphs that the assumption of normality lacks academic consensus. Has that been established since? – H Huang Jan 14 '21 at 01:28
  • The statistical theory is pretty clear about it as far as I can understand. At the same time, fighting misconceptions is often hard. The sensible default position should be *an assumption is not needed unless proven otherwise* -- rather than *the assumption is needed unless proven otherwise*. If someone thinks they need an assumption, the burden of the proof should be on them. – Richard Hardy Jan 14 '21 at 06:09
  • [Pearson's or Spearman's correlation with non-normal data](https://stats.stackexchange.com/questions/3730/pearsons-or-spearmans-correlation-with-non-normal-data/3733?) – user2974951 Jan 14 '21 at 06:11
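
The distinction discussed in the comments, between Pearson's $r$ as a measure of linear association and Spearman's rho as a measure of monotonic association, can be illustrated on clearly non-normal data. The following is a minimal sketch; the helper names (`pearson_r`, `ranks`, `spearman_rho`) and the simulated exponential data are assumptions made for illustration only. It shows only that the coefficients can be computed and interpreted descriptively on skewed data. It says nothing about the sampling distribution of either coefficient.

```python
import math
import random


def pearson_r(x, y):
    """Pearson's r computed directly from its definition."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman_rho(x, y):
    """Spearman's rho: Pearson's r applied to the ranks."""
    return pearson_r(ranks(x), ranks(y))


rng = random.Random(0)  # fixed seed so the run is repeatable
x = [rng.expovariate(1.0) for _ in range(200)]  # skewed, non-normal
y_linear = [2.0 * v + 1.0 for v in x]           # exact linear function of x
y_monotone = [math.exp(v) for v in x]           # monotone but non-linear

print(pearson_r(x, y_linear))       # linear relation, skewed marginals
print(pearson_r(x, y_monotone))     # monotone relation seen through Pearson's r
print(spearman_rho(x, y_monotone))  # same relation seen through the ranks
```

In this sketch Pearson's $r$ is essentially 1 for the exactly linear pair despite the skewed marginals, and it falls below 1 for the monotone non-linear pair, where Spearman's rho remains essentially 1.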

0 Answers