We have a very large population (about 1e6 items) containing an unknown number of item types. We draw a small sample of about 100 items and find that exactly one type appears twice; every other item in the sample is unique. The question is how to estimate the number of item types (the diversity) in the population, which can be treated as effectively infinite relative to the sample.
Moreover, the frequencies of the item types might follow an exponential distribution (with one type being the most frequent), with unknown parameters.
I know this is very little to work with.
My knee-jerk reaction would be to run a Monte Carlo simulation over a range of parameter values and pick the maximum likelihood estimate.
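To make the simulation idea concrete, here is a minimal sketch in Python. It rests on several assumptions that are not part of the problem statement: the sample is drawn with replacement (a stand-in for an effectively infinite population), the weight of type i is proportional to exp(-decay * i), with decay = 0 meaning uniform frequencies, and "exactly one item was duplicated" means the sample contains exactly sample_size - 1 distinct types. The names type_weights, prob_one_duplicate and grid_mle are hypothetical helpers, not library functions.

```python
import random
from itertools import accumulate
from math import exp


def type_weights(num_types, decay):
    """Assumed frequency model: weight of type i is exp(-decay * i).

    decay = 0.0 gives uniform frequencies; larger values concentrate
    the mass on the first few types.
    """
    return [exp(-decay * i) for i in range(num_types)]


def prob_one_duplicate(num_types, decay, sample_size=100, trials=2000, seed=0):
    """Monte Carlo estimate of P(exactly one type appears twice, rest unique).

    Draws `sample_size` items with replacement, which approximates
    sampling from an effectively infinite population.
    """
    rng = random.Random(seed)  # same seed for every candidate (common random numbers)
    cum = list(accumulate(type_weights(num_types, decay)))
    population = range(num_types)
    hits = 0
    for _ in range(trials):
        sample = rng.choices(population, cum_weights=cum, k=sample_size)
        # sample_size - 1 distinct values <=> exactly one type drawn twice
        if len(set(sample)) == sample_size - 1:
            hits += 1
    return hits / trials


def grid_mle(type_counts, decays, **kwargs):
    """Return (likelihood, num_types, decay) maximizing the simulated likelihood."""
    return max(
        (prob_one_duplicate(k, d, **kwargs), k, d)
        for k in type_counts
        for d in decays
    )


if __name__ == "__main__":
    # Small illustrative grid; a real search would need a finer grid and more trials.
    candidates = [500, 1000, 2000, 5000, 10000, 20000]
    print(grid_mle(candidates, decays=[0.0], trials=1000))
```

With a single observation, the number of types and the decay parameter are unlikely to be identifiable separately: once decay is noticeably above zero, the tail types carry almost no mass, so many (num_types, decay) pairs give nearly the same likelihood. Fixing one of the two, as the usage example does with decay = 0, at least makes the grid search well posed.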
However, do you think there is an analytical solution?