I was wondering if there is a method equivalent to Latin Hypercube Sampling (LHS) when the input space you are sampling from is a finite discrete set of possible values. For example, if I had two variables $x_1,x_2\in\mathcal{X}$ where $\mathcal{X}=\{1,2,3,4,5,6,7,8,9,10\}$, would there be a way to sample from this that is equivalent to LHS? With two variables it doesn't seem like a tough problem, but with three or more I imagine it can become quite complicated.

kjetil b halvorsen
Dan

2 Answers

Not sure if anyone is still worried about this, but I also had this question last year. For a previous paper, I implemented a sampling design I called a Latin Hyper-rectangular Design (LHrD) for sampling from multiple discrete factors with different numbers of levels, assuming a uniform probability for each factor.

This method preserves the Latin property and randomizes entries so that diagonal sampling is not favored. See the supplemental data of our paper for the algorithm.

This has not been subjected to rigorous testing, but I am assuming that the general properties of space-filling designs will hold, albeit with a little more variability given the asymmetry in the dimensions.

Edit: Latin hypercube sampling has been implemented as an R package for multivariate empirical distributions by Pierre Roudier et al., following Minasny and McBratney (2006). This is much better than my simple attempt.

sdnj

Yes, you can do this as long as each variable has the same number of values. The number of samples you get is equal to the number of values, so in your example there will only be 10 samples in a given batch. The algorithm works as follows:

1. For each variable, create a set holding all of its remaining values. Initially this is the set of all possible values.
2. To draw a sample, go through each variable and draw a value uniformly from its set of remaining values. Remove the drawn value from the set.
3. Repeat until all sets are exhausted.
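
The steps above can be sketched in Python roughly as follows. This is only a minimal illustration, assuming every variable shares the same level set; the function name `discrete_lhs` and its parameters are made up for this example and do not come from any library. Shuffling each variable's values and reading them off in order is equivalent to repeatedly drawing uniformly without replacement.

```python
import random


def discrete_lhs(levels, n_vars, rng=None):
    """Draw one batch of len(levels) samples over n_vars discrete variables.

    Hypothetical helper for illustration: each variable uses every level
    exactly once within the batch (the Latin property).
    """
    rng = rng or random.Random()
    levels = list(levels)

    # One independently shuffled copy of the level set per variable.
    # Reading a shuffled list front to back is the same as drawing
    # uniformly from the "remaining values" and removing each draw.
    columns = []
    for _ in range(n_vars):
        column = levels[:]
        rng.shuffle(column)
        columns.append(column)

    # Sample i takes the i-th entry of every shuffled column.
    return [tuple(column[i] for column in columns) for i in range(len(levels))]


# Example: three variables, each taking values 1..10 (seed chosen arbitrarily).
batch = discrete_lhs(range(1, 11), n_vars=3, rng=random.Random(0))
for sample in batch:
    print(sample)
```

Each batch contains exactly as many samples as there are levels; if more samples are needed, one option is to draw several independent batches.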

Tom Minka
  • Can you give a canonical reference for this solution? – Dan Sep 02 '14 at 16:56
  • See page 3 of [A User’s Guide to LHS: Sandia’s Latin Hypercube Sampling Software](http://prod.sandia.gov/techlib/access-control.cgi/1998/980210.pdf) – Tom Minka Sep 03 '14 at 18:33
  • All that page 3 and onward describes is Latin hypercube sampling in general. There is no proposed solution in there for the case of finite discrete sets. – Dan Sep 03 '14 at 19:05
  • The first step of their algorithm converts the continuous variables into discrete ones, and proceeds on the discrete values. If the variables are already discrete, you just skip that first step. – Tom Minka Sep 04 '14 at 12:18
  • @Tom Minka: Could you add this extra information to the answer? – kjetil b halvorsen Feb 23 '19 at 11:25