
I want to generate two continuous random variables, Q1 and Q2 (quantitative traits), and two binary random variables, Z1 and Z2 (binary traits), with specified pairwise correlations between every pair: (Q1,Q2): 0.23, (Q1,Z1): 0.55, (Q1,Z2): 0.45, (Q2,Z1): 0.40, (Q2,Z2): 0.50, (Z1,Z2): 0.47.

Please help me generate such data in R.

  • 2
    If you are just asking for code, this question would be off-topic for CV (see our [help center](http://stats.stackexchange.com/help/on-topic)). There is some information about generating correlated continuous variables here: [How to generate correlated random numbers](http://stats.stackexchange.com/q/38856/7290). Generating correlated binary variables is trickier; it depends on how you are thinking about the data generating process. I have a related answer on SO here: [Generate categorical variables with chosen degree of association](http://stackoverflow.com/a/20330903/1217536). – gung - Reinstate Monica May 24 '14 at 14:45
  • 1
    A better place to look for understanding the ideas related to correlated binary variables is this excellent answer by @chl: [Differences between tetrachoric and Pearson correlation](http://stats.stackexchange.com/a/3135/7290). – gung - Reinstate Monica May 24 '14 at 14:48

1 Answer


I have never tried this, but I think you can use the Cholesky decomposition.

Generate your variables independently with unit variance, arrange them as the columns of a matrix, and then multiply that matrix by the Cholesky factor of the covariance matrix that you would like to obtain.
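The recipe above can be sketched in R as follows. This is only a minimal illustration, assuming standard normal inputs and treating the six correlations from the question as the target matrix; the names `Sigma`, `X`, `R` and `Y` are mine, not from the question. Note that R's `chol()` returns the upper-triangular factor, so the independent columns are multiplied on the right.

```r
set.seed(1)
n <- 1e5
vars <- c("Q1", "Q2", "Z1", "Z2")

# Target correlation matrix, filled with the values given in the question
Sigma <- matrix(c(1.00, 0.23, 0.55, 0.45,
                  0.23, 1.00, 0.40, 0.50,
                  0.55, 0.40, 1.00, 0.47,
                  0.45, 0.50, 0.47, 1.00),
                nrow = 4, byrow = TRUE, dimnames = list(vars, vars))

# Independent standard normal columns (assumption: normal inputs)
X <- matrix(rnorm(n * 4), nrow = n)

# chol() returns upper-triangular R with t(R) %*% R equal to Sigma,
# so X %*% R has covariance t(R) %*% I %*% R = Sigma
R <- chol(Sigma)
Y <- X %*% R
colnames(Y) <- vars

# Sample correlations, to compare against Sigma
round(cor(Y), 2)
```

At this stage all four columns of `Y` are still continuous; nothing has been made binary yet.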

This works when all the variables are normal. I am not sure whether it works for binary variables, but it is easy to try.
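One way to try it, as a hedged sketch: generate the four correlated normals as described, then make Z1 and Z2 binary by cutting them at 0. The cut at 0 is my assumption (the question does not say how the binary traits arise or what their prevalences are), and `D` and `achieved` are illustrative names.

```r
set.seed(1)
n <- 1e5
vars <- c("Q1", "Q2", "Z1", "Z2")

# Target correlations from the question
Sigma <- matrix(c(1.00, 0.23, 0.55, 0.45,
                  0.23, 1.00, 0.40, 0.50,
                  0.55, 0.40, 1.00, 0.47,
                  0.45, 0.50, 0.47, 1.00),
                nrow = 4, byrow = TRUE, dimnames = list(vars, vars))

# Correlated normals via the (upper-triangular) Cholesky factor
Y <- matrix(rnorm(n * 4), nrow = n) %*% chol(Sigma)
colnames(Y) <- vars

# Assumption: Z1 and Z2 become binary by thresholding the normals at 0
D <- data.frame(Q1 = Y[, "Q1"],
                Q2 = Y[, "Q2"],
                Z1 = as.integer(Y[, "Z1"] > 0),
                Z2 = as.integer(Y[, "Z2"] > 0))

# Pearson correlations actually achieved, and their gap from the targets
achieved <- cor(D)
round(achieved, 2)
round(achieved - Sigma, 2)
```

Under this thresholding, the Q1-Q2 correlation is preserved, but every correlation involving a binary variable shrinks: for a split at 0, the binary-binary correlation should come out near (2/pi) * asin(0.47), about 0.31 rather than 0.47, and the continuous-binary ones near 0.8 times their targets. So the plain Cholesky recipe does not hit the requested values for the binary traits; the correlations fed into the normal step would have to be larger than the targets to compensate.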

Donbeo