Programming inverse-transformation sampling for Pareto distribution

Question

I am having trouble deriving a formula, and running a simulation with its distribution. The Pareto distribution has CDF:

$$F(x) = 1 - \bigg( \frac{k}{x} \bigg)^\gamma \quad \quad \quad \text{for } x \geqslant k,$$

where $k>0$ is the scale parameter and $\gamma>0$ is the shape parameter. I need to derive the probability inverse transformation 'quantile':

$$X = F^{-1}(U) = Q(U).$$

I tried deriving the equation and ended up with $X = k/\text{gammaroot}(1-U)$. Does that make sense? If so, how would I do a $\text{gammaroot}$ function in R?

Please explain what you mean by "gammaroot." Would it be something like taking the $1/\gamma$ power? — whuber, Nov 21 '20 at 20:35
Yes, that would be. If that was the case, did I derive the equation correctly to obtain X? — John Huang, Nov 21 '20 at 20:49
If you want to write $\sqrt[\gamma]{x}$, use `\sqrt[\gamma]{x}`. — J.G., Nov 22 '20 at 14:40

forecaster · Accepted Answer · 2020-11-21T21:09:10.347

I'm assuming you are referring to the inverse transform sampling method. It's very straightforward. Refer to the Wikipedia article and this site.

The Pareto CDF is given by: $$ F(x) = 1 - \left(\frac{k}{x}\right)^\gamma, \quad x \geqslant k > 0 \text{ and } \gamma > 0. $$

All you do is set the CDF equal to a uniform random variable and solve for $x$:

$$\begin{aligned} U &\sim \text{Uniform}(0,1) \\ F(x) &= u \\ 1 - \left(\frac{k}{x}\right)^\gamma &= u \\ x &= k(1-u)^{-1/\gamma} \end{aligned}$$
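Spelling out the intermediate algebra may help, since it shows that the "gammaroot" in the question is just the $1/\gamma$ power (this is only a restatement of the result above, not a different method):

$$\begin{aligned} \left(\frac{k}{x}\right)^\gamma &= 1-u \\ \frac{k}{x} &= (1-u)^{1/\gamma} \\ x &= \frac{k}{(1-u)^{1/\gamma}} = k(1-u)^{-1/\gamma} \end{aligned}$$

So the formula derived in the question, $X = k/\sqrt[\gamma]{1-U}$, is the same expression, and in R the root is written as a power, e.g. `(1 - u)^(1/g)`.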

Now in R:


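#N = number of samples
#g = shape parameter (gamma), k = scale parameter
rpar <- function(N,g,k){
  
  if (k <= 0 || g <= 0){
    stop("both k and g must be > 0")
  }
  
  #inverse transform: x = k*(1-u)^(-1/g) with u ~ Uniform(0,1)
  k*(1-runif(N))^(-1/g)
}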


rand_pareto <- rpar(1e5,5, 16)
hist(rand_pareto, 100, freq = FALSE)

#verify using the random variate generator rpareto in package extraDistr
x <- extraDistr::rpareto(1e5, 5, 16)
hist(x, 100, freq = FALSE)

This will give you random variates from the Pareto distribution. I'm not sure where you are getting "gammaroot" from.
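Besides comparing histograms by eye, one can check the sampler numerically. The following is a minimal sketch, assuming the same parameter values as above ($k = 16$, $\gamma = 5$); the seed, the chosen probabilities, and the variable names are illustrative assumptions, not part of the original answer. It compares empirical quantiles of the simulated values with the theoretical quantiles $k(1-p)^{-1/\gamma}$.

```r
set.seed(1)                      # arbitrary seed, only for reproducibility

k <- 16                          # scale parameter
g <- 5                           # shape parameter (gamma)

# inverse-transform sampling: x = k * (1 - u)^(-1/g), u ~ Uniform(0, 1)
u <- runif(1e5)
x <- k * (1 - u)^(-1 / g)

# compare empirical and theoretical quantiles at a few probabilities
probs <- c(0.1, 0.5, 0.9)
emp   <- quantile(x, probs, names = FALSE)
theo  <- k * (1 - probs)^(-1 / g)

# largest relative discrepancy; should be small for a sample this large
max_rel_err <- max(abs(emp - theo) / theo)

print(rbind(empirical = emp, theoretical = theo))
print(max_rel_err)
```

If the derivation is right, the two rows should agree closely and every simulated value should be at least $k$.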

if you think this was helpful, can you go ahead and accept the answer? — forecaster, Nov 21 '20 at 21:05

Ben · Answer 2 · 2020-11-21T21:25:13.690

Setting the probability $p=F(x)$ and inverting the CDF equation gives the quantile function:

$$Q(p) = \frac{k}{(1-p)^{1/\gamma}} \quad \quad \quad \text{for all } 0 \leqslant p \leqslant 1.$$

The corresponding log-quantile function is:

$$\log Q(p) = \log k - \frac{1}{\gamma} \log (1-p) \quad \quad \quad \text{for all } 0 \leqslant p \leqslant 1.$$

The probability functions for the Pareto distribution are already available in R (see e.g., the EnvStats package). However, it is fairly simple to program this function from scratch if preferred. Here is a vectorised version of the function.

qpareto <- function(p, scale, shape = 1, lower.tail = TRUE, log.p = FALSE) {
  
  #Check input p
  if (!is.vector(p))                { stop('Error: p must be a numeric vector') }
  if (!is.numeric(p))               { stop('Error: p must be a numeric vector') }
  if (min(p) < 0)                   { stop('Error: p must be between zero and one') }
  if (max(p) > 1)                   { stop('Error: p must be between zero and one') }
  
  n   <- length(p)
  OUT <- numeric(n)
  for (i in seq_len(n)) { 
    #P is the upper-tail probability 1 - F(x)
    P      <- ifelse(lower.tail, 1-p[i], p[i])
    OUT[i] <- log(scale) - log(P)/shape }
  
  #Note: here log.p = TRUE returns the log-quantile
  if (log.p) OUT else exp(OUT) }
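Because arithmetic in R is already vectorised, the explicit loop is not strictly necessary. Below is a minimal sketch of a loop-free variant; the name `qpareto_vec` is hypothetical (chosen to avoid clashing with package functions), and the extra checks on `scale` and `shape` are an added assumption. It keeps the convention used above that `log.p = TRUE` returns the log-quantile, which differs from base R quantile functions, where `log.p` means the probabilities are supplied on the log scale.

```r
# Hypothetical vectorised variant of the quantile function above
qpareto_vec <- function(p, scale, shape = 1, lower.tail = TRUE, log.p = FALSE) {

  # basic input checks
  stopifnot(is.numeric(p), all(p >= 0 & p <= 1), scale > 0, shape > 0)

  # P is the upper-tail probability 1 - F(x)
  P <- if (lower.tail) 1 - p else p

  # log-quantile: log(k) - log(1 - p) / gamma
  logq <- log(scale) - log(P) / shape

  if (log.p) logq else exp(logq)
}

# median of a Pareto with scale 16 and shape 5, i.e. 16 * 2^(1/5)
qpareto_vec(0.5, scale = 16, shape = 5)

# the endpoints map to the scale parameter and to infinity
qpareto_vec(c(0, 1), scale = 16, shape = 5)
```

Working on the log scale first, as in the original function, keeps the log-quantile available without an extra `log()` call.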

Once you have the quantile function it is simple to generate random variables using inverse-transformation sampling. Again, this is already done in the existing Pareto functions in R, but if you want to program it from scratch, that is quite simple to do.

rpareto <- function(n, scale, shape = 1) {
  
  #Check input n
  if (!is.vector(n))                { stop('Error: n must be a single positive integer') }
  if (!is.numeric(n))               { stop('Error: n must be a single positive integer') }
  if (length(n) != 1)               { stop('Error: n must be a single positive integer') }
  if (as.integer(n) != n)           { stop('Error: n must be a single positive integer') }
  if (n <= 0)                       { stop('Error: n must be a single positive integer') }
  
  qpareto(runif(n), scale = scale, shape = shape) }
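As a quick sanity check on a sampler of this kind, one can compare the sample mean with the theoretical mean $\gamma k/(\gamma - 1)$, which exists only when $\gamma > 1$. The sketch below is self-contained and uses a compact stand-in sampler with the hypothetical name `r_pareto_sketch`; the seed and parameter values (scale 16, shape 5) are assumptions chosen for illustration.

```r
set.seed(2020)                   # arbitrary seed, only for reproducibility

# compact stand-in sampler: inverse-transform sampling for the Pareto
r_pareto_sketch <- function(n, scale, shape = 1) {
  scale * (1 - runif(n))^(-1 / shape)
}

scale <- 16
shape <- 5

x <- r_pareto_sketch(1e5, scale = scale, shape = shape)

# theoretical mean, valid for shape > 1
theo_mean <- shape * scale / (shape - 1)

print(c(sample = mean(x), theory = theo_mean))
```

With a sample this large the two numbers should agree to within a small fraction of a unit; a large gap would point to an error in the quantile formula or in the parameter order.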
