Here is the situation. (This is not a homework problem.)

I am writing a program that does Cool And Interesting Things starting with a correlation matrix among 3 variables: call them $X$, $Y$, and $Z$. I want the user to be able to specify the correlation matrix using any combination of 3 simple or partial correlations. That is to say, the user supplies the correlation between each pair of variables, but each of those 3 correlations may be either simple or partial.

For example, one possibility would be that the user supplies the simple correlation between $Y$ and $X$, the simple correlation between $X$ and $Z$, and the partial correlation between $Y$ and $Z$ (controlling for $X$). The program should deduce the simple correlation matrix (i.e., convert the one partial correlation to a simple correlation) and then proceed from there.

The program should be able to handle any possible combination of inputs (as long as it ultimately specifies a valid simple correlation matrix). Up to permutations of the variables, there are 4 possible types of input, namely:

  • 3 simple correlations
  • 2 simple correlations, 1 partial correlation
  • 1 simple correlation, 2 partial correlations
  • 3 partial correlations

I have worked out only 2 of the 4 cases. In the first case, if the user just supplies the 3 simple correlations, then there is nothing to convert. In the last case, where the user supplies 3 partial correlations, I can obtain the simple correlations by reversing the procedure described HERE. But I am having a hard time working out the 2 more interesting cases. Can anyone help point me in the right direction? Thanks!

NOTE: I have cross-posted this question on the TalkStats forum, where I am an active member, HERE. Please check for answers there before duplicating another's effort.

Jake Westfall
  • I am curious about the assertion following "obviously": there are non-trivial relationships the three correlations have to satisfy, so it is quite possible for a user to supply three simple correlations that correspond to no random variables at all. It would seem that there *is* a problem to solve even in this case: you should at least report that the inputs are invalid! – whuber Oct 29 '14 at 23:13
  • Thanks for feedback @whuber, you are absolutely right, although I thought I covered this with my parenthetical note "(as long as it ultimately specifies a valid simple correlation matrix)." Currently the program verifies that the simple correlation matrix has a non-negative determinant. – Jake Westfall Oct 29 '14 at 23:17
  • Thanks for pointing out that parenthetical remark, Jake: I did indeed overlook it. Allow me to point out something I hope is equally obvious: up to permutations--which are simple to handle--there are only four types of input, of which you have solved two, leaving only two problems rather than six. – whuber Oct 29 '14 at 23:18
  • From the well-known [formula](http://stats.stackexchange.com/a/76819/3277) it follows that in the case of two simple correlations and one partial correlation, the simple correlation corresponding to the latter is easily computed. As for the case "two partial and one simple correlations", I hesitate to say right now. – ttnphns Oct 30 '14 at 08:48
  • @whuber Yes, the question is perhaps more clear if I write the 4 types of input rather than all permutations. Will edit. – Jake Westfall Oct 30 '14 at 15:59
  • @ttnphns Thanks...I am embarrassed I overlooked the solution for that case. – Jake Westfall Oct 30 '14 at 16:00

1 Answer

I have now solved each of the cases that I described, so I thought I'd post the solutions here for posterity.

Case 1: Three simple correlations

The only thing to do in this case is to verify that the 3 simple correlations form a valid correlation matrix. Provided each correlation lies in $[-1, 1]$, this amounts to verifying that the correlation matrix has a non-negative determinant.
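A minimal Python sketch of this check, for illustration only. The function name `is_valid_corr3` and the tolerance `tol` are my own choices, not part of the original program; the determinant is written out in closed form for the 3×3 case.

```python
def is_valid_corr3(r_xy, r_xz, r_yz, tol=1e-12):
    """Return True if the three simple correlations form a valid 3x3 correlation matrix."""
    # Each individual correlation must lie in [-1, 1].
    if any(abs(r) > 1 for r in (r_xy, r_xz, r_yz)):
        return False
    # Determinant of [[1, r_xy, r_xz], [r_xy, 1, r_yz], [r_xz, r_yz, 1]].
    det = 1 + 2 * r_xy * r_xz * r_yz - r_xy**2 - r_xz**2 - r_yz**2
    # Allow a small negative value to absorb floating-point error (assumed tolerance).
    return det >= -tol
```

For instance, `is_valid_corr3(0.9, 0.9, -0.9)` is rejected, because two variables that both correlate strongly with a third cannot be strongly negatively correlated with each other.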

Case 2: Two simple correlations, One partial correlation

In this case (as pointed out by @ttnphns in a comment) we can compute the one missing simple correlation by taking the well-known formula for writing a partial correlation coefficient in terms of the simple correlations, $$ r_{ab.c}=\frac{r_{ab}-r_{ac}r_{bc}}{\sqrt{1-r^2_{ac}}\sqrt{1-r^2_{bc}}}, $$ and solving it for the simple correlation term $r_{ab}$, which yields $$ r_{ab}=r_{ab.c}\sqrt{1-r^2_{ac}}\sqrt{1-r^2_{bc}}+r_{ac}r_{bc}. $$
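The Case 2 formula translates directly into code. This is a small sketch; the function and argument names (`simple_from_partial`, `r_ab_c`, and so on) are hypothetical and chosen only to mirror the notation above.

```python
import math


def simple_from_partial(r_ab_c, r_ac, r_bc):
    """Recover the simple correlation r_ab from the partial correlation r_ab.c
    and the two known simple correlations r_ac and r_bc."""
    return r_ab_c * math.sqrt(1 - r_ac**2) * math.sqrt(1 - r_bc**2) + r_ac * r_bc
```

A quick sanity check is the round trip: compute $r_{ab.c}$ from three known simple correlations using the first formula, feed it back in, and confirm that the original $r_{ab}$ comes out.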

Case 3: Three partial correlations

As explained in the link I posted in my question, to go from a simple correlation matrix to a partial correlation matrix, we invert the simple correlation matrix, divide each off-diagonal element by the product of the square roots of the two corresponding diagonal elements (as if we were converting a covariance matrix to a correlation matrix), and multiply each off-diagonal by $-1$. So to reverse this process, we take the partial correlation matrix, multiply the off-diagonals by $-1$, take the matrix inverse, and then divide each off-diagonal by the square roots of the corresponding diagonals as we did before. If we work through these matrix computations by hand (actually I used this Wolfram Alpha widget), we can see that this leads to the following equation for writing a simple correlation in terms of a triplet of partial correlations: $$ r_{ab}=\frac{r_{ab.c}+r_{ac.b}r_{bc.a}}{\sqrt{(r^2_{ac.b}-1)(r^2_{bc.a}-1)}}. $$
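Here is a sketch of Case 3 that applies the closed-form equation to each pair in turn rather than inverting a matrix. The names `simple_from_three_partials`, `p_ab`, `p_ac`, and `p_bc` are my own shorthand for $r_{ab.c}$, $r_{ac.b}$, and $r_{bc.a}$, and the code assumes every partial correlation is strictly between $-1$ and $1$ so that no denominator is zero.

```python
import math


def simple_from_three_partials(p_ab, p_ac, p_bc):
    """Convert three partial correlations to three simple correlations.

    p_ab = r_ab.c, p_ac = r_ac.b, p_bc = r_bc.a.
    Returns the tuple (r_ab, r_ac, r_bc).
    """
    # Note (p^2 - 1)(q^2 - 1) equals (1 - p^2)(1 - q^2), so this matches the formula above.
    r_ab = (p_ab + p_ac * p_bc) / math.sqrt((1 - p_ac**2) * (1 - p_bc**2))
    # The other two pairs follow by relabelling the variables.
    r_ac = (p_ac + p_ab * p_bc) / math.sqrt((1 - p_ab**2) * (1 - p_bc**2))
    r_bc = (p_bc + p_ab * p_ac) / math.sqrt((1 - p_ab**2) * (1 - p_ac**2))
    return r_ab, r_ac, r_bc
```

The result should still be checked for validity as in Case 1, since not every triplet of partial correlations corresponds to a valid correlation matrix.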

Case 4: One simple correlation, Two partial correlations

For this case we can get the one missing partial correlation by taking the formula introduced for Case 3 and solving it for the $r_{ab.c}$ term, which yields $$ r_{ab.c}=r_{ab}\sqrt{(r^2_{ac.b}-1)(r^2_{bc.a}-1)}-r_{ac.b}r_{bc.a}. $$ After solving for the missing partial correlation, Case 4 is reduced to Case 3, which we can solve as described just above.
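The two steps of Case 4 can be sketched together as follows. The function name `solve_one_simple_two_partial` and the argument names are hypothetical, `p_ac` and `p_bc` stand for $r_{ac.b}$ and $r_{bc.a}$, and the code assumes all partial correlations (including the recovered one) are strictly between $-1$ and $1$.

```python
import math


def solve_one_simple_two_partial(r_ab, p_ac, p_bc):
    """Given the simple correlation r_ab and the partials p_ac = r_ac.b, p_bc = r_bc.a,
    return the three simple correlations (r_ab, r_ac, r_bc)."""
    # Step 1: recover the missing partial correlation r_ab.c.
    p_ab = r_ab * math.sqrt((1 - p_ac**2) * (1 - p_bc**2)) - p_ac * p_bc
    # Step 2: with all three partials known, apply the Case 3 formula to the other pairs.
    r_ac = (p_ac + p_ab * p_bc) / math.sqrt((1 - p_ab**2) * (1 - p_bc**2))
    r_bc = (p_bc + p_ab * p_ac) / math.sqrt((1 - p_ab**2) * (1 - p_ac**2))
    return r_ab, r_ac, r_bc
```

If the recovered `p_ab` falls outside $(-1, 1)$, the inputs do not describe a valid correlation matrix and the square roots in step 2 will fail, so in practice a range check between the two steps seems advisable.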

Jake Westfall