1) I know that my data are negative binomial (NB) distributed. But I also know that they contain several outliers (probably zeros and extremely large values). How can I estimate the NB parameters?
I found the trick described in an answer here on CV, and I agree that it is reasonable, but are there more efficient strategies? For example, in some NB simulations too many points do not satisfy the criterion $\sqrt{x} > \operatorname{med}(\sqrt{X}) + 3$ for $x \in X$. Can I use the alphaOutlier package iteratively to exclude outliers? It actually seems to work... Is there something REALLY wrong with this approach?
install.packages("alphaOutlier")
library("alphaOutlier")
library("MASS")  # provides fitdistr()
# Simulate NB data and append four artificial outliers
s <- 100
m <- 100
vect <- rnbinom(100, size = s, mu = m)
vect <- c(vect, 1000, 1000, 1, 1)
# Initial fit; aout.nbinom() takes param = c(size, prob), with prob = size / (size + mu)
distr <- fitdistr(vect, "Negative Binomial")
estim_s <- distr$estimate["size"]
estim_m <- distr$estimate["mu"]
out <- aout.nbinom(vect, param = c(estim_s, estim_s / (estim_s + estim_m)), alpha = 0.05 / length(vect))
# Drop the flagged points and refit until no outliers remain
while (any(out$is.outlier)) {
  vect <- vect[!out$is.outlier]
  distr <- fitdistr(vect, "Negative Binomial")
  estim_s <- distr$estimate["size"]
  estim_m <- distr$estimate["mu"]
  out <- aout.nbinom(vect, param = c(estim_s, estim_s / (estim_s + estim_m)), alpha = 1 / length(vect))
}
out
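For comparison, here is a minimal sketch of the square-root rule mentioned above, assuming it is meant to flag a point $x$ as an outlier when $\sqrt{x} > \operatorname{med}(\sqrt{X}) + 3$. The function name `flag_sqrt_rule` and its argument `k` are hypothetical names introduced only for this illustration, and the seed is arbitrary.

```r
# Hypothetical helper (not from any package): flags x as an outlier when
# sqrt(x) exceeds the median of sqrt(x) by more than k (k = 3 in the rule).
flag_sqrt_rule <- function(x, k = 3) {
  sqrt(x) > median(sqrt(x)) + k
}

set.seed(1)  # arbitrary seed, only for reproducibility
x <- c(rnbinom(100, size = 100, mu = 100), 1000, 1000, 1, 1)

flagged <- flag_sqrt_rule(x)
x[flagged]    # values the rule marks as outliers
sum(flagged)  # how many points were flagged
```

The rule is one-sided, so under this reading it can only flag large values: the two 1000s are caught, but small values such as the two 1s are never flagged.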
Solved: 2) I know that $\xi \sim NB(r,p)$. Can I say something about the parameters $r, p$ of "$2\xi$", i.e., of the process with doubled intensity? For the Poisson distribution, knowing $Pois(\lambda)$ lets me draw conclusions about $Pois(2\lambda)$. Can I do the same for $NB$?
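One way to make this precise, as a sketch under assumptions not stated above: read "double intensity" as doubling the Poisson rate in the gamma-Poisson mixture representation of the NB, and take $p$ to be the success probability in the convention of R's `dnbinom`, i.e. $p = r/(r+\mu)$ with mean $\mu$. Under that reading, $r$ stays the same and only $p$ changes:

$$
\begin{aligned}
\xi \mid \Lambda \sim Pois(\Lambda),\quad \Lambda \sim \mathrm{Gamma}\!\left(r,\ \mathrm{scale} = \tfrac{1-p}{p}\right)
&\;\Rightarrow\; \xi \sim NB(r, p),\quad E[\xi] = \frac{r(1-p)}{p}, \\
\xi' \mid \Lambda \sim Pois(2\Lambda),\quad 2\Lambda \sim \mathrm{Gamma}\!\left(r,\ \mathrm{scale} = \tfrac{2(1-p)}{p}\right)
&\;\Rightarrow\; \xi' \sim NB(r, p'),\quad \frac{1-p'}{p'} = \frac{2(1-p)}{p}, \\
p' = \frac{p}{2-p},
&\qquad E[\xi'] = \frac{r(1-p')}{p'} = \frac{2r(1-p)}{p} = 2\,E[\xi].
\end{aligned}
$$

Literally multiplying the variable by two is a different thing: $2\xi$ takes only even values, so it is not NB distributed.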