
I want to randomly generate 1000 normal variates (e.g., using `rnorm`) with mean 100, such that 25% of the 1000 numbers are over 110.

How can I do this in R?

I only got this far:

x <- rnorm(1000,100,1) 
whuber
rlost
  • Do you need to generate according to a distribution whose mean is $100$ or do you need the mean of the generated values to equal $100$? (The two are different!) I also ask for similar clarification concerning the proportion over $110$. – whuber Dec 05 '12 at 22:11
  • I need to generate according to a distribution whose mean is 100. – rlost Dec 05 '12 at 22:46
  • ...and whose upper quartile is $110$? – whuber Dec 05 '12 at 22:50
  • If you're talking about generating from a distribution while constraining the sample quantities, [this thread](http://stats.stackexchange.com/questions/30303/how-to-simulate-data-that-satisfy-specific-constraints-such-as-having-specific-m) will be of interest. If you're only talking about simulating normals for specific values of $\mu$ and the 75th percentile, some careful thinking about the [normal quantile function](http://en.wikipedia.org/wiki/Normal_distribution#Quantile_function), which can be calculated in `R` with `qnorm`, and about what a proper multiplier would be, will solve your problem. – Macro Dec 05 '12 at 22:59

2 Answers


As mentioned in the comments, the normal distribution has the quantile function

$F^{-1}(p;\,\mu,\sigma^2) = \mu + \sigma\Phi^{-1}(p) = \mu + \sigma\sqrt2\operatorname{erf}^{-1}(2p - 1), \quad p\in(0,1)$

In this case, with $\mu = 100$ and $p = 0.75$, we need

$110=F^{-1}(0.75;\,100,\sigma^2) = 100 + \sigma\Phi^{-1}(0.75)$

Solving for $\sigma$ gives $\sigma = 10/\Phi^{-1}(0.75) \approx 14.83$, which is all we need:

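sigma <- 10 / qnorm(0.75)  # standard deviation that puts the 75th percentile at 110
quantile(rnorm(10000, mean = 100, sd = sigma), 0.75)  # sample check, should be close to 110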
     75% 
110.0221 
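To connect this back to the 1000 draws in the question, here is a minimal sketch that compares the theoretical proportion above 110 with the proportion in one sample. The seed and the names `sigma`, `x`, `p_theory`, and `p_sample` are illustrative choices, not part of the original answer, and it assumes the constraint applies to the distribution rather than to the sample.

```r
# Sketch: assumes the target is a Normal(100, sigma^2) distribution
# whose 75th percentile is 110. Names and seed are illustrative only.
set.seed(1)                                # arbitrary seed, for reproducibility only
sigma <- 10 / qnorm(0.75)                  # standard deviation solved from the quantile equation
x <- rnorm(1000, mean = 100, sd = sigma)   # the 1000 draws asked for

# Probability of exceeding 110 under the distribution (0.25 by construction)
p_theory <- pnorm(110, mean = 100, sd = sigma, lower.tail = FALSE)

# Proportion of this particular sample above 110 (varies from run to run)
p_sample <- mean(x > 110)

print(c(theory = p_theory, sample = p_sample, sample_mean = mean(x)))
```

The sample proportion and sample mean will only be close to 25% and 100, not exactly equal, which is the distinction raised in the comments on the question.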
Julius Vainora

You can draw random numbers until you hit a distribution you like:

while ( TRUE ) {
  x <- rnorm(1000, 100, 1)
  # 25% of 1000 draws is 250, so stop once at least 250 values exceed 110
  if ( sum(x > 110) >= 250 ) break
}

However, note that with a standard deviation of 1, any value above 110 lies more than ten standard deviations above the mean. The probability of that is about $7.6\times10^{-24}$ per draw, so you will have to wait quite a bit... and the result would be so atypical that I would hesitate to label it "a set of normally distributed random numbers of which 25% just happened to be larger than 110".
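To get a feel for how long that wait would be, the following sketch computes the chance that a single draw from a normal distribution with mean 100 and standard deviation 1 exceeds 110, and the expected number of such draws in a sample of 1000. The variable names `p_exceed` and `expected_count` are illustrative.

```r
# Sketch: how rare is a value above 110 when x ~ Normal(100, 1)?
# P(X > 110), i.e. more than ten standard deviations above the mean
p_exceed <- pnorm(110, mean = 100, sd = 1, lower.tail = FALSE)

# Expected number of values above 110 in one sample of 1000 draws
expected_count <- 1000 * p_exceed

print(p_exceed)
print(expected_count)
```

Using `lower.tail = FALSE` rather than `1 - pnorm(...)` matters here: the subtraction would round to exactly zero in floating point.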

Stephan Kolassa
  • I suspect the point of this question is to determine an appropriate standard deviation for the distribution. :-) – whuber Dec 05 '12 at 22:12
  • Ok, thank you :). I'm gonna try with different standard deviations. – rlost Dec 05 '12 at 22:57