Fun fact: the 0.632 quantile of the distribution is $\lambda$ (without regard to $k$), so one way to estimate $\lambda$ is the sample 0.632 quantile:
    wd <- scan()
    1: 604 104 224 200 1444 1076 1308 6084 468 2308
    11: 
    Read 10 items
    lam0 <- quantile(wd, p = 0.632)
    lam0
       63.2% 
    1235.616 
Another fun fact is the basis of the Weibull plot:
$\ln(-\ln(1- F(x))) = k\ln(x) -k\ln(\lambda)$
If we look at approximating that relationship by sample values:
$\ln(-\ln(1-\hat F(x)))= \hat{k}_0\ln(x) -\hat{k}_0\ln(\hat\lambda_0)$
That is, a plot of $\ln(-\ln(1-\hat F(x)))$ against $\ln(x)$ should look roughly linear, and its slope should be a reasonable estimate of $k$; one could use regression to estimate it. However, when working with data it is $\ln(x)$ that is the random quantity while the plotting positions on the left are fixed, so it's common to recast the relationship and regress $\ln(x)$ on $\ln(-\ln(1-\hat F(x)))$. The reciprocal of the slope is then an estimate of the shape parameter (and, conveniently, exponentiating the intercept then gives an estimate of the scale).
For the values taken by $\hat F$ in such a plot it is common to use $\frac{i-\alpha}{n+1-2\alpha}$ for $\hat F(x_{(i)})$, where $x_{(i)}$ is the $i$th smallest observation (just as for normal scores plots). The Wikipedia page suggests $\alpha=0.3$, but the default in R's ppoints should be fine:
    ws <- sort(wd)
    Fh <- ppoints(ws)
    k0 <- 1/lm(log(ws) ~ log(-log(1 - Fh)))$coefficients[2]
(You can avoid the need to sort by using the same idea as in stats:::qqnorm.default -- it avoids explicitly rearranging the data by using ppoints(n)[order(order(y))] to get the plotting positions in the same order as the data -- but I think the version above is clearer.)
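As a minimal sketch of that trick (re-entering the data from above so the snippet is self-contained): the double order() recovers the ranks of the observations, so each data point picks up its own plotting position without the data itself being sorted, and the regression gives the same shape estimate as the sorted version.

```r
# the data read in above
wd <- c(604, 104, 224, 200, 1444, 1076, 1308, 6084, 468, 2308)
n <- length(wd)
# plotting positions aligned with the data order; no explicit sort of wd
Fh2 <- ppoints(n)[order(order(wd))]
# regress log(x) on log(-log(1 - F)); reciprocal of slope estimates shape
k0b <- 1/coef(lm(log(wd) ~ log(-log(1 - Fh2))))[2]
```

Because regression is unaffected by jointly reordering response and predictor, this matches the sorted-data fit exactly.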
- You could alternatively recover $\lambda$ from the intercept of that straight-line fit.
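For instance (a self-contained sketch, re-entering the data from above): with $\ln(x)$ regressed on $\ln(-\ln(1-\hat F(x)))$, the intercept estimates $\ln(\lambda)$, so exponentiating it gives a rough scale estimate.

```r
# the data read in above
wd <- c(604, 104, 224, 200, 1444, 1076, 1308, 6084, 468, 2308)
ws <- sort(wd)
Fh <- ppoints(ws)
# intercept of this regression estimates log(lambda)
fit <- lm(log(ws) ~ log(-log(1 - Fh)))
lam0b <- exp(coef(fit)[1])   # crude estimate of the scale parameter
```

This should land in the same ballpark as the 0.632-quantile estimate obtained earlier.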
These sorts of approaches are common and usually sufficient.
However, with your particular data there are several issues. The first is simply one of scale, and rescaling avoids it:
    library(MASS)   # for fitdistr
    fitdistr(wd/10, densfun = "weibull", start = list(shape = k0, scale = lam0/10))
          shape         scale   
      0.8912185   128.0091222 
     (0.2104516) ( 47.7940515)
Note that here I am using the built-in Weibull density rather than yours; it is a tiny bit more numerically stable (though not by much, by the look of things). Note also that the reported scale parameter is now one tenth of the actual scale.
So that looked like it converged, right? Well, first there were warnings, and we had to rescale the data to get a fit at all, so it's behaving pretty badly -- we're not at all sure it actually converged. We may need to play with the convergence criteria.
If you instead rescale a bit further it will even produce an answer without a starting value, but we still get warnings, so again we're not sure it converged properly:
    fitdistr(wd/100, densfun = "weibull")
          shape       scale  
      0.8941211   12.9974798 
     (0.2109907) ( 4.8739202)
Fortunately, we can do better still -- the survreg function in the survival package (which comes with R but isn't loaded by default) can fit Weibull regression models, so that's another way to get parameter estimates. The model is parameterized differently, but that presents no problem; we just convert to the required parameterization:
    library(survival)
    ones <- rep(1, length(wd))    # a vector of 1's
    Sw <- Surv(wd, ones)          # a survival object with all "times" in wd uncensored
    sro <- survreg(Sw ~ 1)$icoef  # fit an intercept-only model
                                  # (i.e. the same distribution to every data point)
                                  # ... no warnings are issued here
    lam0 <- exp(sro[1])           # pull out the parameters and
    k0 <- exp(-sro[2])            # transform to our parameterization
    c(k0, lam0)                   # take a look at the values (ignore the names)
     Log(scale)  (Intercept) 
      0.8941177 1299.7480322 
    fitdistr(wd/100, densfun = "weibull", start = list(shape = k0, scale = lam0/100))
          shape       scale  
      0.8941177   12.9974803 
     (0.2109901) ( 4.8739411)
That converges to the same values (up to the scale factor), again without warnings. Scaling the values back up, we have:

          shape        scale  
      0.8941177   1299.74803  
     (0.2109901) ( 487.39411) 
A bit of a roundabout way to do it, but it turns out that the values we weren't sure had converged essentially had.
Changing the options (such as the convergence criteria) that fitdistr passes through to optim may well solve the problem more easily than this, but I find survreg often gets there on the Weibull when fitdistr has trouble.
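As a sketch of that first option (self-contained, re-entering the data; the starting values are round versions of the estimates found above): fitdistr passes extra arguments through to optim, so we can tighten the relative convergence tolerance and raise the iteration cap via optim's control list.

```r
library(MASS)
# the data read in above
wd <- c(604, 104, 224, 200, 1444, 1076, 1308, 6084, 468, 2308)
# control is passed through to optim: tighter tolerance, more iterations
fit <- fitdistr(wd/100, densfun = "weibull",
                start = list(shape = 0.89, scale = 13),  # rough starts from above
                control = list(maxit = 1000, reltol = 1e-12))
fit$estimate
```

With the tighter tolerance this should land on essentially the same estimates as the survreg route (shape about 0.894, scale about 13 on the /100 scale).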