
How can I create two random sequences of numbers that are $50\%$ correlated to each other? Each sequence has a different variance.

sooprise
  • I notice you've only accepted answers on 14% of the questions you've asked, and it looks (to me) like several at least have received satisfactory answers, perhaps you could go back to those questions and accept the answers you feel are appropriate. –  Dec 03 '11 at 22:52
  • What do you want the distributions to be? And do you want the correlation of the generated numbers to be exactly 50%, or do you want to generate random numbers from a joint distribution whose correlation is 50%? –  Dec 03 '11 at 23:45
  • both normally distributed, the correlation of the generated numbers should be exactly 50% –  Dec 04 '11 at 00:22
  • Perhaps you have some thoughts on the matter and something that's got you stuck? – cardinal Dec 04 '11 at 00:47
  • See duplicates, [How to define a distribution that correlates with a draw from another distribution?](http://stats.stackexchange.com/q/13382/1036) and [Generate a random variable with a defined correlation to an existing variable](http://stats.stackexchange.com/q/15011/1036) – Andy W Dec 05 '11 at 13:18
  • The only possible difference is the note about different variances, although accounting for that is trivial: multiplying each vector by a positive constant changes its variance but does not change the correlation. – Andy W Dec 05 '11 at 13:25

1 Answer


Generate a pair of independent $N(0,1)$ random variables $X$ and $Y$ using, for example, the Box-Muller transform or Marsaglia polar method. Then set $$\begin{align*} A & = X,\\ B &= \frac{X + \sqrt{3}Y}{2} \end{align*}$$ Then $A$ and $B$ are $N(0,1)$ random variables with correlation coefficient $0.5$. $A$ is the first element of one of your two desired sequences and $B$ is the first element of the other sequence. Repeat to get the second elements of your sequences. Continue this process for as long as needed.
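As a rough illustration of the construction above, here is a minimal NumPy sketch. It assumes NumPy's built-in normal generator is an acceptable stand-in for Box-Muller or the Marsaglia polar method; the seed and the sequence length `n` are arbitrary choices made for the example, not part of the original answer.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # arbitrary seed, only for reproducibility
n = 100_000                          # illustrative sequence length

# Independent N(0, 1) draws (NumPy's generator stands in for Box-Muller here).
x = rng.standard_normal(n)
y = rng.standard_normal(n)

# A = X, B = (X + sqrt(3) * Y) / 2, applied element by element.
a = x
b = (x + np.sqrt(3.0) * y) / 2.0

# The sample correlation is close to 0.5 but not exactly 0.5.
sample_corr = np.corrcoef(a, b)[0, 1]
print(f"sample correlation: {sample_corr:.3f}")
print(f"sample variance of b: {b.var():.3f}")
```

The correlation of $0.5$ is a property of the joint distribution, so any finite sample will only approximate it.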

Edit: I just noticed that you want the sequences to have different variances, though you don't say what you want the variances to be. This is easy to fix. Set
$$\begin{align*} A & = \alpha X,\\ B &= \beta \frac{X + \sqrt{3}Y}{2} \end{align*}$$ where $|\alpha| \neq |\beta|$ to get sequences with different variances $\alpha^2$ and $\beta^2$.
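The scaled version might be sketched as follows. The function name `scaled_pair` and the values `alpha=2.0`, `beta=3.0` are hypothetical, chosen only for illustration. The sketch also assumes $\alpha$ and $\beta$ have the same sign, since opposite signs would flip the correlation to $-0.5$.

```python
import numpy as np

def scaled_pair(alpha, beta, n, rng):
    """Return two length-n sequences with variances alpha**2 and beta**2.

    The population correlation is 0.5 when alpha and beta share a sign
    (an assumption of this sketch) and -0.5 when their signs differ.
    """
    x = rng.standard_normal(n)
    y = rng.standard_normal(n)
    a = alpha * x
    b = beta * (x + np.sqrt(3.0) * y) / 2.0
    return a, b

rng = np.random.default_rng(seed=1)  # arbitrary seed
a, b = scaled_pair(alpha=2.0, beta=3.0, n=100_000, rng=rng)

# Sample variances should be near 4 and 9; sample correlation near 0.5.
print(a.var(), b.var(), np.corrcoef(a, b)[0, 1])
```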

Dilip Sarwate
  • @Andy W's comments on the main question are spot on, and my answer above is less general than the ones to previous questions. But note that my answer was posted on math.SE when this question was asked on that site. – Dilip Sarwate Dec 05 '11 at 13:41
  • Hi Dilip, although this answers the OP's question, I believe it could be usefully expanded upon in several ways. 1) It would be good to note that this does not produce an exact correlation between $A$ and $B$; 2) this solution takes advantage of the fact that the requested correlation is 0.5, and it might be worthwhile to expand it to other situations in which the correlation is either higher or lower; 3) it would be worthwhile to talk about why the resulting $B$ vector would have an expected variance of 1 (e.g. where do the $\sqrt{3}$ and $2$ come from?). – Andy W Dec 05 '11 at 14:10
  • @AndyW The [answer by Macro](http://stats.stackexchange.com/a/13384/6633) to the question you pointed out says that one should use $\rho X + \sqrt{1-\rho^2}Z$ where $X$ and $Z$ have the same distribution. Here $\rho = \frac{1}{2}$ and so $\sqrt{1-\rho^2} = \frac{\sqrt{3}}{2}$, and I am not sure if editing my answer to explain how the $\sqrt{3}$ and $2$ came about will really add anything new. I think I will just leave well enough alone. – Dilip Sarwate Dec 05 '11 at 22:14
  • Ah, this is quite a good solution. Can you please tell me how you arrived at the equations for $A$ and $B$ to get a correlation of 0.5? That is the only part I don't understand. I did put together the equations in Excel and everything is checking out :) – sooprise Dec 07 '11 at 20:44
  • See the references in Andy W's comments on the main question (made after it was migrated to stats.SE). More directly, having chosen $A=X$ (for simplicity), set $B=cX+dY$. Then $E[B]=0$, $\text{var}(B)=c^2+d^2$, and $\text{cov}(A,B)=E[AB]=cE[X^2]+dE[XY]=c$, since $E[X^2]=1$ and $E[XY]=0$. We want $$\rho_{A,B}=\frac{\text{cov}(A,B)}{\sigma_A\sigma_B}=\frac{c}{1\times\sqrt{c^2+d^2}}=\frac{1}{2}$$ and so choosing $c=\frac{1}{2}$ and $d=\sqrt{1-c^2}=\frac{\sqrt{3}}{2}$ (which also gives $\text{var}(B)=1$), we get the desired relationship. More generally, as pointed out in a comment above, choose $c=\rho, d=\sqrt{1-\rho^2}$ to get any desired correlation (positive or negative). – Dilip Sarwate Dec 07 '11 at 21:18
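
The general recipe $c=\rho$, $d=\sqrt{1-\rho^2}$ from the last comment can be sketched in NumPy as below. The function names are hypothetical. The second function is not part of the answer: it is a swapped-in technique (centering the samples and removing the component of `y` along `x`) that forces the *sample* correlation to equal $\rho$, which is what the asker's comment requested. The price is that the outputs are no longer independent normal draws in the strict sense, only approximately so for large `n`.

```python
import numpy as np

def correlated_pair(rho, n, rng):
    """Draw A = X and B = rho*X + sqrt(1 - rho^2)*Y with X, Y iid N(0, 1).

    The population correlation is rho; the sample correlation only
    approximates it.
    """
    if not -1.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [-1, 1]")
    x = rng.standard_normal(n)
    y = rng.standard_normal(n)
    return x, rho * x + np.sqrt(1.0 - rho**2) * y

def exactly_correlated_pair(rho, n, rng):
    """Return sequences whose sample correlation equals rho up to rounding.

    Assumption of this sketch: n >= 3, so the centered vectors are not
    degenerate.
    """
    if not -1.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [-1, 1]")
    x = rng.standard_normal(n)
    y = rng.standard_normal(n)
    x = x - x.mean()                 # center both samples
    y = y - y.mean()
    y = y - (y @ x) / (x @ x) * x    # make y exactly orthogonal to x
    x_unit = x / np.linalg.norm(x)
    y_unit = y / np.linalg.norm(y)
    b_unit = rho * x_unit + np.sqrt(1.0 - rho**2) * y_unit
    scale = np.sqrt(n)               # gives sample variance 1 (ddof=0)
    return scale * x_unit, scale * b_unit

rng = np.random.default_rng(seed=2)  # arbitrary seed
a, b = correlated_pair(0.8, 100_000, rng)
print(np.corrcoef(a, b)[0, 1])       # near 0.8

a_exact, b_exact = exactly_correlated_pair(-0.3, 1_000, rng)
print(np.corrcoef(a_exact, b_exact)[0, 1])  # -0.3 up to floating-point error
```

To give the two sequences different variances, multiply each output by its own positive constant, which leaves the correlation unchanged.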