1

I want to generate random samples from normal distribution such that :

$X \sim \mathcal N(u_1,s_1)$

$Y \sim \mathcal N(u_2,s_2)$

and $\mathrm{cor}(X,Y)=k$ {k is non zero}.

If k=0 then X and Y can be generated easily in R, like :

X <- rnorm(n,mean, sd)
Y <- rnorm(n,mean, sd)

But I donot know how can I generate joint random samples from Normal distribution such that their correlation is not equal to zero.

It would be helpful if you provide procedure with R code.

Sven Hohenstein
  • 6,285
  • 25
  • 30
  • 39
Neeraj
  • 2,150
  • 18
  • 28
  • See `mvnorm` in MASS library or https://cran.r-project.org/web/packages/mvtnorm/index.html , however as strictly software-related this kind of questions are off-topic on this site. – Tim Nov 10 '15 at 07:21
  • 1
    As a pure-R "what function should I call" this would be off topic. As a "what's a way to generate correlated multivariate normals (that I could implement in R)" question it would be on topic but has already been answered several times. The specific bivariate case has also been treated several times. – Glen_b Nov 10 '15 at 07:29
  • It is not pure R related question. I want to know the process in simple ways. I found similar question on [http://stats.stackexchange.com/questions/38856/how-to-generate-correlated-random-numbers-given-means-variances-and-degree-of] but it is difficult to understand as it involve matrix notation. I want answer in simple explanation. – Neeraj Nov 10 '15 at 07:33
  • 1
    For the multivariate case, the matrix version *is* the simple way to do it. [Here's the bivariate case](http://stats.stackexchange.com/questions/13382/how-to-define-a-distribution-such-that-draws-from-it-correlate-with-a-draw-from), which can be done easily without matrices. Does that cover what you need? – Glen_b Nov 10 '15 at 09:51

1 Answers1

1

Here I'm doing it the hard way, but if you go through it, it might be enlightening. The basic idea is I construct the covariance matrix $\Sigma$, then use the eigenvector decomposition to compute the matrix $\Sigma^{1/2}$. Then if $u = (u_1,u_2)$, we get the formula: $(X,Y) = u + \Sigma^{1/2}z$, where the $z$ are a pair of independent standard normal random variables.

For example:

s1 = 2
s2 = 4
u1 = 10
u2 = 12
r = 0.8
covar = r*s1*s2
Sigma = matrix(ncol=2,nrow=2,c(s1^2,covar,covar,s2^2))
temp = eigen(Sigma)
SqrtSigma = temp$vectors%*%diag(sqrt(temp$values))%*%t(temp$vectors)
XYvec = c(u1,u2) + SqrtSigma%*%rnorm(2)
XYvec
#           [,1]
# [1,]  9.265028
# [2,] 11.230126

I'll check it with a simulation:

x = rep(NA,1000)
y = rep(NA,1000)
for(i in 1:1000){
  XYvec = c(u1,u2) + SqrtSigma%*%rnorm(2)
  x[i] = XYvec[1]
  y[i] = XYvec[2]
}
var(x)
# [1] 4.060218
var(y)
# [1] 16.18099
cor(x,y)
# [1] 0.8002916

mean(x)
# [1] 9.935937
mean(y)
# [1] 11.94783
AlaskaRon
  • 2,219
  • 8
  • 12
  • I appreciate your answer. Can you use non matrix notation to simplify that? I have no idea about eigenvector decomposition. – Neeraj Nov 10 '15 at 10:26