1

I am using a Weibull distribution in R, and know that:

E(X) = 1000 and Var(X) = 500,000

Knowing:

E(X^r) = ($\Gamma$(1+ (r/$\gamma$))) * 1/c^(r/$\gamma$))

I found the following parameters using the uniroot functions in R and obtained:

c = 0.00004298545546 and $\gamma$ = 1.435504

However entering the following code in R:

rweibull(100, shape = 1.435504, scale = 1/0.00004298546))

and computing its mean:

mean(rweibull(100, shape = 1.435504, scale = 1/0.00004298546))) 

gives: 19585.5 and not a value close to 1000 as expected.

Can anybody help?

Ferdi
  • 4,882
  • 7
  • 42
  • 62

2 Answers2

6

TL;DR - your mean formula is not correct.

There's several (two?) commonly used parameterizations of Weibull distribution. Your formula refers to either that, or maybe some generalized Weibull with a third parameter. Following R parameterization, the formula is simply $$ E(X) = b \Gamma(1 + 1/a)$$ (with $a$ - shape, $b$ - scale, from R's help or Wiki). Plugging your values I get 21123.48, which is close to your simulated sample mean.

juod
  • 2,192
  • 11
  • 20
  • 1
    As a comment: increasing sample size for simulation would make it even closer. – Tim Aug 08 '18 at 07:12
2

I think you may have a mistake in your transcription of the function as well. More than that, you say you used uniroot. That is a one-dimensional root finder. The Weibull is a two-parameter distribution, and it must be optimized over both parameters simultaneously. As a rough analogy, you are restricting yourself to moving North-South only or East-West only along the two-dimensional parameter surface. You have to follow the shape of the surface, even diagonally, to find the minimum. I'll provide an example of simultaneously fitting both shape and scale.

Using the Klugmnan-Panjer-Wilmott (KPW) notation (standard notation for US actuaries), the Weibull has:

$$ \begin{align} f(x) &= \frac{\tau\left(\frac{x}{\theta}\right)^\tau e^{-\left(\frac{x}{\theta}\right)^\tau}}{x}\\ F(x) &= 1-e^{-\left(\frac{x}{\theta}\right)^\tau}\\ E\left(X^r\right) &= \theta^r\cdot\Gamma\left(1+\frac{r}{\tau}\right),\quad k \gt -\tau \end{align} $$ Given that, one can use a numerical optimization routine, as you did, to find $r, \theta$.

set.seed(716)
library(nloptr)

SqErr <- function(par, Mean, Var){
  t <- par[[1]]
  q <- par[[2]]
  Mu <- q * gamma(1 + 1 / t)
  SigSq <- q ^ 2 * gamma(1 + 2 / t)
  ParVar <- SigSq - Mu ^ 2
  return((Mu - Mean) ^ 2 + (ParVar - Var) ^ 2)
}

A <- nloptr(c(2, 1), SqErr, Mean = 1000, Var = 500000,
        opts = list(algorithm = "NLOPT_LN_NELDERMEAD",
                    maxeval = 1e5, ftol = 1e-12))

print(A$solution)
[1]    1.435442 1101.254610
B <- rweibull(1e5, A$solution[[1]], scale = A$solution[[2]])
mean(B); var(B)
[1] 1000.568
[1] 499458.7
Avraham
  • 3,182
  • 21
  • 40
  • 1
    +1. Note that $\theta$ is easily found (algebraically) once you know $\tau$ and that $\tau$ is the positive root of a function $\Gamma(1+1/\tau)^2/\Gamma(1+2\tau) - C = f(\tau)-C$ with $0\lt C \lt 1.$ Its graph is quite flat for both large and small $\tau,$ making root-finding difficult in those cases. It is nearly linearized (except for extremely small $\tau$) by considering instead roots of $(1-f(\tau))^{-1/2}-(1-C)^{-1/2}.$ That should give you nearly perfect accuracy. – whuber Dec 13 '18 at 00:32