9

I have a problem for which I believe I should use the hypergeometric distribution, but I can't figure out how to do it in R.

Say I have a bag of marbles with known number ($N$) of marbles, but the number of successes (white marbles) in the bag ($K$) is unknown. I want to infer about K.

Given a sample from this population, where I see $n$ trials and $k$ successes, how do I infer something about the population $K$?

Ideally, I'd like to construct a prior distribution around $K$ and then use the sample to update it and get a Bayesian posterior credibility interval around $K$ (for a given credibility score $\alpha$), but I am struggling to complete this practically. I have read that the conjugate prior for the hypergeometric is the beta-binomial. I thought maybe there would be an R function or package that could take the prior parameters, and then update with a sample to give me a credibility interval, but haven't been able to find it.

If the Bayesian setting is difficult, perhaps a confidence interval would suffice... Can anyone either point me to some R functions or tutorials, or some resources that could help? Thanks.

EDIT: To add an example, I can do this in the case of the binomial distribution, to infer $p$, given a sample. I can construct a credibility interval with the binom package

k= 15
n= 25
library(binom)
binom.bayes(k, n, conf.level = .95, tol=.005, type="central")
# method  x  n shape1 shape2      mean     lower     upper  sig
#  bayes 15 25   15.5   10.5 0.5961538 0.4057793 0.7725105 0.05

To add a prior, because of the way updates work with the beta-binomial, I can just add counts to the $k$ and $n$ parameters according to the $a$ and $b$ parameters of the prior beta distribution.

In the beta-binomial example, $N$ is infinite, and I'm inferring $p$. What I want to do is take this exact situation and just extend it to the case where $N$ is finite (and known), and infer $K$ (which would be equivalent to inferring $p$). This changes the binomial to the hypergeometric.

nsheff
  • 191
  • 4

1 Answers1

-1

You can solve your problem using the Maximum Likelihood method rather than the Bayesian method. Just calculate the hypergeometric probability of finding k for a large number of values of K using your known values of k, N and n. The value of K that provides the maximum for this probability is the expected value of K.

Phil
  • 1