DNS resolution can return one of several IP addresses for the same name, typically for load balancing. I would like to enumerate the IPs for a service so that I can whitelist traffic to a domain without performing an excessive number of reverse lookups. How many repeat records should I see before I stop, to have a high probability of having enumerated the whole set?
More formally: I have a fixed set of unknown cardinality, and I can only sample from it at random, with replacement (assume each element is equally likely to be returned). How should I decide when to stop sampling?
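To make the setup concrete, here is a minimal simulation sketch in Python. It is not the formula I am after, only a way to test a candidate stopping rule empirically. The function names, the parameters `n` (true set size) and `k` (repeat threshold), and the choice to count *consecutive* repeats are all my own assumptions for illustration.

```python
import random


def sample_until_repeats(n, k, rng):
    """Draw uniformly with replacement from a hidden set of size n.

    Stop once k consecutive draws have returned an already-seen element
    (an assumed stopping rule). Return how many distinct elements were seen.
    """
    seen = set()
    streak = 0  # consecutive draws that revealed nothing new
    while streak < k:
        x = rng.randrange(n)
        if x in seen:
            streak += 1
        else:
            seen.add(x)
            streak = 0
    return len(seen)


def estimate_coverage(n, k, trials=2000, seed=0):
    """Estimate the probability that the rule stops only after seeing all n."""
    rng = random.Random(seed)
    complete = sum(sample_until_repeats(n, k, rng) == n for _ in range(trials))
    return complete / trials


if __name__ == "__main__":
    # Hypothetical scenario: 8 addresses behind one name.
    for k in (1, 5, 10, 20, 40):
        print(k, estimate_coverage(8, k))
```

Of course this only works when `n` is known, which is exactly what I do not have in practice; it is useful for checking a proposed rule against sets of various sizes, not for deriving one.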
There should be a formula with a tuneable confidence level, but I have not found it by searching yet. I appear to be searching for the wrong kinds of things ("unknown sample size, how many samples to saturate", "enumerate set unknown cardinality", etc.). Set enumeration via random selection seems to me a fairly general problem.