How can I fit a model to predict the Pearson correlation coefficient between x and y as a function of z? All three variables are continuous.

The best I can come up with is to bin z and calculate the correlation separately within each bin. It seems there must be a better way to do this that preserves z as a continuous variable.
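
For reference, the binning approach described above could be sketched as follows. This is only an illustration: the function name `binned_correlation`, the choice of quantile bins, and the synthetic data (built so that the true conditional correlation equals z) are assumptions, not part of my actual dataset.

```python
import numpy as np

def binned_correlation(x, y, z, n_bins=5):
    """Pearson correlation of x and y within quantile bins of z (hypothetical helper)."""
    x, y, z = map(np.asarray, (x, y, z))
    # Quantile edges give bins with roughly equal numbers of points.
    edges = np.quantile(z, np.linspace(0.0, 1.0, n_bins + 1))
    # Map each point to a bin index in 0..n_bins-1 (the maximum goes in the last bin).
    idx = np.clip(np.searchsorted(edges, z, side="right") - 1, 0, n_bins - 1)
    centers, cors = [], []
    for b in range(n_bins):
        mask = idx == b
        centers.append(z[mask].mean())
        cors.append(np.corrcoef(x[mask], y[mask])[0, 1])
    return np.array(centers), np.array(cors)

# Synthetic data (assumption): cor(x, y | z) = z, with unit conditional variances.
rng = np.random.default_rng(0)
n = 5000
z = rng.uniform(0.0, 1.0, n)
y = rng.standard_normal(n)
x = z * y + np.sqrt(1.0 - z**2) * rng.standard_normal(n)

centers, cors = binned_correlation(x, y, z)
for c, r in zip(centers, cors):
    print(f"mean z in bin = {c:.2f}, correlation = {r:.2f}")
```

The drawback is visible in the sketch itself: the result depends on `n_bins`, and each estimate only describes an average over its bin rather than a value at a specific z.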

A concrete example: I'm interested in how mean annual rainfall correlates between two locations as a function of the distance between those locations. I have a dataset of sites with measured rainfall and know the distance between all pairs of sites.

bcbvi
  • Do you mean $\operatorname{cor}(X \mid Z=z,\; Y \mid Z=z)$? – Dave Aug 28 '20 at 14:35
  • What is the reason not to use a multiple regression with an interaction, which would be the automatic way to do this? – whuber Aug 28 '20 at 15:06
  • @whuber The reason is just my lack of knowledge. I think I understand the suggestion, though: set up a regression like so, $x = \beta_0 + \beta_1 y z$; then $\beta_1 z$ is the covariance between $x$ and $y$. Is that correct? If so, that would be an answer to my question. Thanks! – bcbvi Aug 28 '20 at 17:23
  • @Dave yes: I think that captures my question. – bcbvi Aug 28 '20 at 17:25
  • @whuber revising my comment: Presumably I'd need to set it up as: $(X-\overline{X}) = \beta (Y-\overline{Y}) Z + \epsilon$? If this is along the right lines I'd like to accept your solution as a full answer. – bcbvi Aug 28 '20 at 17:44
  • You need a richer model of the form $$x = \beta_0 + \beta_y y + \beta_z z + \beta_{yz}yz + \varepsilon$$ where $x$ and $y$ have been standardized (rescaled to unit variance). Then for any value $z=z_0,$ the regression is $E[x\mid y, z=z_0] = (\beta_0+\beta_zz_0) + (\beta_y + \beta_{yz}z_0) y$ and the coefficient of $y$ in this model estimates the Pearson correlation conditional on $z_0.$ – whuber Aug 28 '20 at 18:07
  • You could start with some visualization, have a look at https://stats.stackexchange.com/questions/203494/can-i-analyze-or-model-a-conditional-correlation/368228#368228 – kjetil b halvorsen Aug 29 '20 at 05:07
  • @whuber thanks for your comments. I'm not able to solve my specific problem with this exact model because I'm really trying to solve for $z=e^{-d/d_0}$ and I'm not sure how to apply this approach. Your comments answer the question, which I don't think exists elsewhere on Stack Exchange, so I'll leave this question as is rather than edit it. I've posted a more specific question here: https://stats.stackexchange.com/questions/485412/correlation-of-two-continuous-variables-as-a-function-of-the-exponential-decay-o – bcbvi Aug 31 '20 at 13:21
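
The interaction model suggested in the comments, $x = \beta_0 + \beta_y y + \beta_z z + \beta_{yz}yz + \varepsilon$ with standardized $x$ and $y$, might be fitted along these lines. This is a minimal sketch using ordinary least squares; the helper names `fit_interaction` and `slope_at` and the synthetic data are assumptions made for illustration. One caveat worth keeping in mind: a regression slope equals the correlation times the ratio of the conditional standard deviations of $x$ and $y$, so the slope $\beta_y + \beta_{yz}z_0$ coincides with the conditional correlation only when those two conditional standard deviations are equal at $z_0$ (in the synthetic data below both are 1 by construction).

```python
import numpy as np

def fit_interaction(x, y, z):
    """OLS fit of standardized x on [1, y, z, y*z] with y standardized (hypothetical helper)."""
    x, y, z = map(np.asarray, (x, y, z))
    xs = (x - x.mean()) / x.std()
    ys = (y - y.mean()) / y.std()
    design = np.column_stack([np.ones_like(z), ys, z, ys * z])
    beta, *_ = np.linalg.lstsq(design, xs, rcond=None)
    return beta  # order: beta_0, beta_y, beta_z, beta_yz

def slope_at(beta, z0):
    """Coefficient of y at z = z0, i.e. beta_y + beta_yz * z0."""
    return beta[1] + beta[3] * np.asarray(z0)

# Synthetic data (assumption): cor(x, y | z) = z, with unit conditional variances.
rng = np.random.default_rng(0)
n = 5000
z = rng.uniform(0.0, 1.0, n)
y = rng.standard_normal(n)
x = z * y + np.sqrt(1.0 - z**2) * rng.standard_normal(n)

beta = fit_interaction(x, y, z)
for z0 in (0.1, 0.5, 0.9):
    print(f"z0 = {z0:.1f}, estimated slope = {float(slope_at(beta, z0)):.2f}")
```

Unlike binning, this keeps z continuous, at the cost of assuming the slope varies linearly in z. A nonlinear dependence such as $z=e^{-d/d_0}$ with unknown $d_0$ would need a different fitting procedure than plain least squares.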

0 Answers