I have a population of entities associated with different categories, say
blue 50000
red 300
green 80
yellow 10
pink 6
orange 3
white 2
The distribution is known up to the exact counts. It is very homogeneous (dominated by a single category), i.e. if $\{p_1,\dots,p_k\}$ denotes the probabilities for each category, then $\max_i p_i \geq 0.95$.
Now I would like to choose a sample size such that the number of categories $N_k$ present in the sample has expected value $n_k$, i.e. the expected number of categories whose count in the sample is larger than zero is $n_k$. Given a count vector as shown above and a target $n_k$, how do I choose the sample size?
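To make the question concrete, here is a minimal sketch of one way it could be approached. It assumes sampling without replacement from the finite population (the question does not say which scheme is meant). By linearity of expectation, the expected number of categories present in a sample of size $m$ is the sum over categories of the probability that the category appears at least once; without replacement that probability is $1 - \binom{T-c_i}{m}/\binom{T}{m}$, where $c_i$ is the count of category $i$ and $T$ is the population size. With replacement it would be $1-(1-p_i)^m$ instead. The function names `expected_categories` and `smallest_sample_size` are hypothetical, and because $m$ is an integer the search returns the smallest $m$ whose expectation reaches the target rather than an exact match.

```python
from math import lgamma, exp

def log_comb(n, k):
    # log of the binomial coefficient C(n, k), valid for 0 <= k <= n
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)

def expected_categories(counts, m):
    # Expected number of distinct categories in a sample of size m,
    # ASSUMING sampling without replacement (hypergeometric model).
    total = sum(counts)
    expected = 0.0
    for c in counts:
        if m > total - c:
            # The sample is too large to avoid this category entirely.
            expected += 1.0
        else:
            # P(category absent) = C(total - c, m) / C(total, m)
            p_absent = exp(log_comb(total - c, m) - log_comb(total, m))
            expected += 1.0 - p_absent
    return expected

def smallest_sample_size(counts, target):
    # Bisection over m; relies on the expectation being non-decreasing in m.
    if target > len(counts):
        raise ValueError("target exceeds the number of categories")
    lo, hi = 0, sum(counts)
    while lo < hi:
        mid = (lo + hi) // 2
        if expected_categories(counts, mid) >= target:
            hi = mid
        else:
            lo = mid + 1
    return lo

# Count vector from the question: blue, red, green, yellow, pink, orange, white
counts = [50000, 300, 80, 10, 6, 3, 2]
m = smallest_sample_size(counts, 3.0)
print(m, expected_categories(counts, m))
```

Bisection is used here only because the expectation grows monotonically with the sample size; any one-dimensional root finder would serve the same purpose.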