
Given a uniformly distributed sample of data, I need to sub-sample the points so that the sub-sample follows a normal distribution, i.e. denser around the mean and sparser further out. What would the steps be?

Saransh
  • Interesting question. What do you need this for? – Stephan Kolassa Oct 16 '17 at 11:02
  • @StephanKolassa I'm guessing that iterating over each point and using the value of the normal density at that point as the probability of sub-sampling it might work? I'm writing my final-year paper and need this to test my hypothesis. – Saransh Oct 16 '17 at 11:10

1 Answer


Subsample your original data without replacement. The crucial point is to weight each of the original points with the density you are aiming for (parameterizing it, e.g., by the mean and the variance you want). In R:

original <- runif(1e6)
hist(original)

[Figure: histogram of the original sample]

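n.sample <- 1e4   # size of the sub-sample
mm <- 0.3         # target mean
sdev <- 0.2       # target standard deviation

# weight each original point by the target normal density at that point
weights <- dnorm(original, mean = mm, sd = sdev)
# sample() draws without replacement by default; prob need not sum to one
resample <- sample(original, size = n.sample, prob = weights)
hist(resample)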

[Figure: histogram of the sub-sample]
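A quick numerical check can complement the histogram. The sketch below is only an illustration: the seed, the smaller sample sizes (chosen to keep the run fast) and the name `trunc_mean` are my assumptions, not part of the answer above. Because the original points all lie in [0, 1], the sub-sample can only approximate a normal distribution truncated to that interval, so its mean is compared with the truncated-normal mean rather than with `mm` itself.

```r
set.seed(1)  # arbitrary seed, only for reproducibility

original <- runif(1e5)   # smaller than in the answer, to keep the run fast
mm <- 0.3                # target mean
sdev <- 0.2              # target standard deviation

weights <- dnorm(original, mean = mm, sd = sdev)
resample <- sample(original, size = 2000, prob = weights)

# Mean of a normal(mm, sdev) distribution truncated to [0, 1]
a <- (0 - mm) / sdev
b <- (1 - mm) / sdev
trunc_mean <- mm + sdev * (dnorm(a) - dnorm(b)) / (pnorm(b) - pnorm(a))

# Compare the sub-sample mean with the truncated and untruncated targets
print(c(sample_mean = mean(resample),
        truncated_mean = trunc_mean,
        target_mean = mm))
```

With these parameters the lower bound 0 sits only 1.5 standard deviations below `mm`, so a noticeable part of the left tail is cut off and the sub-sample mean should land somewhat above 0.3.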

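The same approach extends to a general (non-uniform) original sample and to a general target density (non-normal, or with other parameters). One caveat for a non-uniform original: the weights should then be the ratio of the target density to the density of the original sample, because weighting by the target density alone yields a sub-sample whose density is proportional to the product of the two.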
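As a hedged illustration of the non-uniform case, the sketch below assumes the density of the original sample is known (here an exponential with rate 1, chosen purely for the example) and divides the target density by it. The names `w_naive`, `w_ratio`, `sub_naive` and `sub_ratio`, the seed and the sizes are all hypothetical. If the original density is not known, it would have to be estimated first, which this sketch does not attempt.

```r
set.seed(2)  # arbitrary seed, only for reproducibility

# Non-uniform original sample with a known density (assumption: Exp(1))
original <- rexp(1e5, rate = 1)

mm <- 1       # target mean
sdev <- 0.3   # target standard deviation

# Naive weights: target density only
w_naive <- dnorm(original, mean = mm, sd = sdev)
# Ratio weights: target density divided by the density of the original
w_ratio <- w_naive / dexp(original, rate = 1)

sub_naive <- sample(original, size = 2000, prob = w_naive)
sub_ratio <- sample(original, size = 2000, prob = w_ratio)

# The naive sub-sample follows exp(-x) * normal density, i.e. a normal
# shifted to mm - sdev^2; the ratio-weighted one should centre near mm
print(c(naive_mean = mean(sub_naive),
        ratio_mean = mean(sub_ratio),
        target_mean = mm))
```

For a uniform original the density is constant, so the ratio weights reduce to the plain target-density weights used in the code above.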

Stephan Kolassa