
If I want to generate a matrix of 10,000 samples (rows) of 3 uncorrelated uniform variables, it is trivial to use antithetic draws to ensure that the mean and the odd central moments equal their "true" values. Julia/MATLAB code below:

w1 = rand(5000,3)
w2 = 1 - w1
w = [w1;w2]

Each column of w will then have a mean of exactly 0.5. Can someone please tell me a method of ensuring that the sample covariance matrix also equals its "true" value (covariances = 0 and variances = 1/12 ≈ 0.0833)? The MASS package in R has a function "mvrnorm" that, when called with the argument "empirical=TRUE", returns multivariate normal samples whose mean and covariance exactly match the specified values. I have not been able to find out online how this is achieved. Any help is much appreciated, for either the multivariate uniform or the normal case.

Steve Kay
  • When in doubt, read the source code. In R, it's freely available online (on CRAN), or by installing `MASS` and typing `MASS::mvrnorm` – shadowtalker Feb 25 '15 at 14:19
  • You could read the source code for `MASS::mvrnorm` for more details, preferably from the tarball, as it will contain any comments in the source: http://cran.r-project.org/src/contrib/3.1.3/Recommended/ Also look at the help page `?MASS::mvrnorm`, as it may contain a pointer to literature on the implementation. – Gavin Simpson Feb 25 '15 at 14:22
  • Thanks for your help Gavin and ssdecontrol. Have looked at the code and can replicate (if not fully understand - will look into). Would tick your comments but not sure how to. Thanks again. Steve – Steve Kay Feb 25 '15 at 18:19
  • Does [this answer](http://stats.stackexchange.com/questions/120179/generating-data-with-a-given-sample-covariance-matrix/120227#120227) discuss what you need? – Glen_b Feb 26 '15 at 23:08

1 Answer

The following MATLAB code removes the sample covariance from a set of normal draws and imposes a specified covariance matrix instead:

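x = randn(100,3);                          % raw standard normal draws
Sigma0 = cov(x);                           % observed sample covariance
Sigma1 = [1 0 .135; 0 1 .115; .135 .115 1]; % desired covariance
z = (x/chol(Sigma0))*chol(Sigma1);         % whiten with chol(Sigma0), then recolor with chol(Sigma1)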
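
As a quick sketch of why this works, write $R_0$ for `chol(Sigma0)` and $R_1$ for `chol(Sigma1)` (my notation, not part of the code above). MATLAB's `chol` returns upper triangular factors, so $\Sigma_0 = R_0^\top R_0$ and $\Sigma_1 = R_1^\top R_1$. Since $z = x R_0^{-1} R_1$ is a linear transform of $x$, its sample covariance is

$$
\operatorname{cov}(z) = R_1^\top R_0^{-\top}\,\Sigma_0\,R_0^{-1} R_1 = R_1^\top R_0^{-\top} R_0^\top R_0\, R_0^{-1} R_1 = R_1^\top R_1 = \Sigma_1 .
$$

The identity is purely algebraic, so it holds for any data whose sample covariance is nonsingular, not only for normal draws.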

You could patch up the means as well, afterward. I would do it by subtracting the observed mean and adding the desired mean, rather than using the method you described.
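
For readers without MATLAB, here is a minimal NumPy sketch of the same idea, combining the covariance step with the mean adjustment just described. The function name `impose_moments` and its arguments are hypothetical names chosen for illustration; this is one possible translation, not the code used by any particular library.

```python
import numpy as np

def impose_moments(x, mean, cov):
    """Linearly transform x so its sample mean and covariance equal mean and cov.

    Hypothetical helper for illustration; assumes the sample covariance
    of x is nonsingular and cov is positive definite.
    """
    x = np.asarray(x, dtype=float)
    xc = x - x.mean(axis=0)                      # subtract the observed mean
    sigma0 = np.cov(xc, rowvar=False)            # observed sample covariance (n-1 divisor)
    r0 = np.linalg.cholesky(sigma0).T            # upper factor: sigma0 = r0.T @ r0
    r1 = np.linalg.cholesky(np.asarray(cov, dtype=float)).T
    white = np.linalg.solve(r0.T, xc.T).T        # xc @ inv(r0), like x/chol(Sigma0)
    return white @ r1 + np.asarray(mean, dtype=float)  # recolor, then add desired mean

rng = np.random.default_rng(0)
x = rng.standard_normal((100, 3))
sigma1 = np.array([[1.0, 0.0, 0.135],
                   [0.0, 1.0, 0.115],
                   [0.135, 0.115, 1.0]])
z = impose_moments(x, np.zeros(3), sigma1)
print(np.allclose(np.cov(z, rowvar=False), sigma1))  # sample covariance vs. target
print(np.allclose(z.mean(axis=0), 0.0))              # sample mean vs. target
```

Note that the transform mixes the columns linearly, so if it is applied to uniform draws the columns of the result are no longer exactly uniform, even though their first two sample moments match.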

Tom Lane
  • Thanks for this. The problem with just subtracting the difference between the observed and desired mean from all observations is that it affects the tails of the distribution. – Steve Kay Mar 03 '15 at 14:18