Questions tagged [capture-mark-recapture]

Capture-mark-recapture is a statistical sampling method for estimating an unknown population size by taking two independent samples and measuring the overlap between them.

A typical example of a capture-recapture method is as follows. Suppose we want to estimate the number of fish (of a given species, probably) in a lake. At some point in time, and at some location on the lake, we take a sample of fish (say $n_1=500$), mark them, and let them go. At a later point in time, and probably in a different location, we take another $n_2=500$ fish and count how many of them were in the original catch. If we find, say, $k_2=20$ marked fish in the second sample, then our estimate of the total number of fish is $n_1 n_2 / k_2 = 12500$. More explicitly, we equate the marked fraction of the second sample with the marked fraction of the whole population:

\begin{align} \frac{k_2}{n_2} &= \frac{n_1}{N} \\ \frac{20}{500} &= \frac{500}{N} \\ N &= \frac{500\times 500}{20} \\ N &= 12500 \end{align}
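The arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function name `lincoln_petersen` is our own label (the ratio $n_1 n_2 / k_2$ is commonly called the Lincoln-Petersen estimator), and the input checks are assumptions about what counts as sensible data rather than part of the method as described here.

```python
def lincoln_petersen(n1: int, n2: int, k2: int) -> float:
    """Estimate population size N by solving k2 / n2 = n1 / N for N.

    n1: number of animals marked in the first sample
    n2: size of the second sample
    k2: number of marked animals found in the second sample
    """
    if k2 <= 0:
        # With no recaptures the ratio is undefined (division by zero).
        raise ValueError("need at least one marked animal in the second sample")
    if k2 > min(n1, n2):
        # Cannot recapture more marked animals than were marked or sampled.
        raise ValueError("k2 cannot exceed n1 or n2")
    return n1 * n2 / k2


# The fish example from the text: 500 marked, 500 resampled, 20 recaptured.
estimate = lincoln_petersen(500, 500, 20)
print(estimate)  # 12500.0
```

Note that the estimate is very sensitive to $k_2$ when few marked fish are recaptured: with the same sample sizes, $k_2 = 10$ would double the estimate to 25000.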

Capture-recapture requires both samples to be random, so that every fish, marked or not, is equally likely to appear in the second sample. In practice, this randomness consists of (i) sampling fish from randomly selected portions of the lake and (ii) relying on natural biological mixing to redistribute the marked fish throughout the closed population. See Thompson (1992, Ch. 18).
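To see why random sampling and thorough mixing matter, here is a minimal simulation sketch. It assumes an idealized closed population in which both samples are simple random samples without replacement; the function name `simulate_estimate`, the true size of 12500, the seed, and the number of repetitions are all illustrative choices, not values taken from any study.

```python
import random


def simulate_estimate(true_n, n1, n2, rng):
    """Run one idealized capture-recapture experiment and return the estimate.

    Assumes perfect mixing: every individual is equally likely to be in
    either sample. Returns None if no marked individual is recaptured.
    """
    population = range(true_n)
    marked = set(rng.sample(population, n1))   # first catch: mark and release
    second = rng.sample(population, n2)        # second catch, after mixing
    k2 = sum(1 for fish in second if fish in marked)
    return None if k2 == 0 else n1 * n2 / k2


rng = random.Random(0)  # fixed seed so the run is reproducible
estimates = [simulate_estimate(12500, 500, 500, rng) for _ in range(200)]
valid = sorted(e for e in estimates if e is not None)
median = valid[len(valid) // 2]
print(f"median estimate over {len(valid)} runs: {median:.0f}")
print(f"range of estimates: {valid[0]:.0f} to {valid[-1]:.0f}")
```

Under these assumptions the estimates scatter around the true population size, but individual runs can be far off because $k_2$ is small. If the marked fish did not mix, or the second sample were taken only where they were released, $k_2$ would be inflated and the population underestimated.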

Similar techniques have been used in other contexts, including the analysis of networks (Warren, Airoldi and Banks 2008), hard-to-reach populations (Handcock, Gile and Mar 2013) and literary studies (Efron and Thisted 1976).

Efron, B. and Thisted, R. (1976). Estimating the Number of Unseen Species: How Many Words Did Shakespeare Know? Biometrika, Vol. 63, No. 3, pp. 435-447. http://www.jstor.org/stable/2335721

Handcock, M. S., Gile, K. and Mar, C. (2013). Estimating Hard-to-Reach Population Size Using Respondent-Driven Sampling Data. http://paa2013.princeton.edu/abstracts/130169

Thompson, S. K. (1992). Sampling. Wiley, New York.

Warren, R., Airoldi, E. and Banks, D. (2008). Network Analysis of Wikipedia. http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.131.4966

32 questions
9
votes
1 answer

Estimating number of balls by successively selecting a ball and marking it

Let's say I have N balls in a bag. On my first draw, I mark the ball and replace it in the bag. On my second draw, if I pick up a marked ball I return it to the bag. If, however, I pick up a non-marked ball then I mark it and return it to the bag. …
8
votes
1 answer

Capture-recapture sampling valid in literary analysis?

So I've been compiling a list of fictional (anime/manga) characters which meet certain criteria (http://www.gwern.net/hafu#list) from a universe of all anime/manga characters since 1963 (which is of unknown total size - but very large!), and I've…
gwern
  • 405
  • 3
  • 15
8
votes
1 answer

Predicting total number of bugs based on number of bugs revealed by each tester

Assuming n testers were independently testing the same application for a given period. Each tester found a given set of bugs (Some of the bugs were detected by more than one tester). For example: Tester 1 found bugs {1,2,3,4,5} Tester 2 found bugs…
Lior Kogan
  • 380
  • 1
  • 11
7
votes
2 answers

Estimate the number of common members in two populations

Suppose population A has $N_1$ members, and population B has $N_2$ members. There are $K$ common members in the populations. We want to estimate $K$. Draw a sample of size $n$ from each population. Through comparing them, we find that these two…
Recomo
  • 71
  • 2
7
votes
1 answer

Estimating Size of a Set based on two Overlapping Subsets

I've searched everywhere for a similar question and many things come close but are not the same. I'm looking for a way to estimate the size of a set if two partially overlapping subsets are known (assuming both subsets were selected at…
5
votes
1 answer

Survival analysis when exact time to event unknown?

My dataset (example here) represents a long-term capture-mark-recapture study, approximately 20 years duration. I am interested in looking at how the survival of animals is influenced by their sex and exposure to viral pathogens. I have data on the…
4
votes
1 answer

Estimation/simulation of homing with time effect

I have birds translocated from site A (original, capture site) to site B (new, release site) and I want to analyse homing behaviour. Translocation of individuals was performed continuously (several weeks), thus a bird released at the beginning of…
4
votes
0 answers

Estimating repeat shoppers from an incomplete sampling

I'm trying to estimate how many people visited the farmers market once, twice, thrice, etc. in a given time period, using sampled data. We have interview data from approximately 50% of visitors as they entered the market which lets us identify them…
Jonathan
  • 1,283
  • 8
  • 15
4
votes
1 answer

Repeated catch–mark–release (urn problem)

Imagine I have a small pond with some fish in it and I want to estimate its population size. I don't have a lot of resources at hand, in fact all I do have is a fishing pole, pen and paper, and a lot of patience. One morning I sit down at the end…
AkselA
  • 326
  • 1
  • 12
3
votes
1 answer

Contaminated mark-recapture: estimating set size from sampled subsets

Someone poured marked balls in my urn! Simplistically, I think this is a capture-recapture problem where, after drawing and marking balls from the urn, somebody added an unknown number (approx 25% of the original draw) of marked balls before the…
2
votes
0 answers

Estimating population size and a proportion

I am interested in the following sampling problem, which I will try to describe by a motivating example. Suppose we want to estimate how many people in a certain area have blue eyes, how many have brown eyes, etc. (think of a color scale). We…
2
votes
1 answer

What does "return rate" mean in population ecology models?

What does "return rate" (tasa de retorno in Spanish) mean in population ecology models, in the context of capture-mark-recapture models? I have found many articles on this, but no definition! I've also found the term "recapture rate". What is it? Is it…
Tomas
  • 5,735
  • 11
  • 52
  • 93
2
votes
1 answer

Estimating Size of a Set based on three or more Overlapping Subsets

What's the solution to the capture-recapture problem with 3 or more overlapping subsets, not just 2, as in the standard version of the problem?
Jessica
  • 1,019
  • 7
  • 14
2
votes
1 answer

Hypergeometric distribution when K is unknown

The probability of drawing $k$ white balls in a sample of size $n$ taken from an urn of $N$ balls, $K$ of which are white, is: $$ P(k|n,N,K) = \frac{{{K}\choose{k}}{{N-K}\choose{n-k}}}{{{N}\choose{n}}} $$ How to infer a probability when…
2
votes
0 answers

Tradeoff between Survival probability and detection probability: Mark-recapture

When scientists are using mark-recapture models on an open population model to estimate the survival probability and the recapture probability (also known as "detection"), how can we be sure that the model is estimating the right thing between the…
M. Beausoleil
  • 941
  • 3
  • 10
  • 23