
If I have two variables X and Y, and I already have the correlations for subsets of the data that are mutually exclusive and exhaustive, can I compute the overall correlation directly from these?

Intuitively, it seems we should be able to just take the weighted average of the subset correlations, but I'm not sure this is sound. I suspect we would need extra assumptions, for example that the means, standard deviations, etc. are constant across the subsets?

kjetil b halvorsen
  • The answer is a definite no, even making those extra assumptions: you can find discussions under the heading of "Simpson's Paradox," among other things. – whuber Dec 28 '20 at 20:32
  • The thread at https://stats.stackexchange.com/a/51927/919 shows what it takes to combine two *covariances.* To combine correlations, you have to convert the correlations into covariances, combine the covariances, and compute the resulting correlation. There is no algebraic simplification available. – whuber Mar 27 '21 at 15:53
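
The procedure described in the comments (convert each correlation to a covariance, combine the covariances, then recompute the correlation) can be sketched as follows. This is a minimal illustration, not code from the linked thread. It assumes that for every subset you know the size, both means, both standard deviations, and the correlation, and that the standard deviations are population values (dividing by n). The names `pearson`, `summarize`, and `pooled_correlation` are hypothetical helpers introduced only for this example.

```python
from math import sqrt
from statistics import fmean, pstdev


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    mx, my = fmean(xs), fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def summarize(xs, ys):
    """Per-subset summary: (n, mean_x, mean_y, sd_x, sd_y, r).

    Assumption: sd_x and sd_y are population SDs (divide by n).
    """
    return (len(xs), fmean(xs), fmean(ys), pstdev(xs), pstdev(ys), pearson(xs, ys))


def pooled_correlation(groups):
    """Overall correlation from per-subset summaries.

    Each subset's correlation is converted back to a covariance
    (r * sd_x * sd_y); the between-subset spread of the means is then
    added before the overall correlation is recomputed.
    """
    n_total = sum(g[0] for g in groups)
    grand_x = sum(n * mx for n, mx, _, _, _, _ in groups) / n_total
    grand_y = sum(n * my for n, _, my, _, _, _ in groups) / n_total

    sxx = syy = sxy = 0.0
    for n, mx, my, sdx, sdy, r in groups:
        dx, dy = mx - grand_x, my - grand_y
        sxx += n * (sdx ** 2 + dx ** 2)        # within + between, X
        syy += n * (sdy ** 2 + dy ** 2)        # within + between, Y
        sxy += n * (r * sdx * sdy + dx * dy)   # within + between, cross term
    return sxy / sqrt(sxx * syy)


# Two subsets, each with correlation -1, whose means differ.
a = summarize([1, 2, 3], [6, 5, 4])
b = summarize([6, 7, 8], [11, 10, 9])
print(a[5], b[5], pooled_correlation([a, b]))
```

In this toy data both subset correlations are -1, so any weighted average of them is also -1, yet the pooled correlation is about 0.81. The subset means enter the result through the `dx * dy` term, which is why the correlations and subset sizes alone do not determine the overall value.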

0 Answers