
I just learned about bootstrapping as a method for dealing with small samples (n<30), which is a major issue with my bioarchaeological data. Here is my code and output for bootstrapping a sample proportion (n=3) to get a 95% CI. Did I do it right?

> library(boot)
> CrSA<-c(0,1,1)
> CrSAmean<- function(x, d) {return(mean(x[d])) }
> boot(data=CrSA, statistic=CrSAmean, R=500)

ORDINARY NONPARAMETRIC BOOTSTRAP

Call:
boot(data = CrSA, statistic = CrSAmean, R = 500)

Bootstrap Statistics :
     original      bias    std. error
t1* 0.6666667 -0.01533333   0.2807337

> boot.mean<-boot(data=CrSA,statistic=CrSAmean,R=500)
> boot.ci(boot.out=boot.mean,type="norm")
BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
Based on 500 bootstrap replicates

CALL : 
boot.ci(boot.out = boot.mean, type = "norm")

Intervals : 
Level      Normal        
95%   ( 0.1284,  1.2116 )  

Calculations and Intervals on Original Scale

  • The justification for the bootstrap is asymptotic. In small samples, it can exhibit problematic behavior and is therefore not a "remedy" for small samples. But it may be fine in your specific case. For more on this topic, see [here](https://stats.stackexchange.com/questions/112147/can-bootstrap-be-seen-as-a-cure-for-the-small-sample-size) or [here](https://stats.stackexchange.com/a/209485/21054). – COOLSerdash Dec 02 '20 at 13:37

1 Answer


If I understand correctly, you want to estimate the proportion of "successes" from $n=3$ trials. With such a small sample size, I'm not sure how the bootstrap or any other non-parametric method can be useful... What if, for example, your vector CrSA is [0, 0, 0] or [1, 1, 1]?
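To make that concern concrete, here is a minimal sketch (not part of the original post) that enumerates every possible bootstrap resample for $n=3$ instead of drawing them at random. The names `idx`, `means`, `all_ones`, and `means_ones` are illustrative. With three observations there are only $3^3 = 27$ ordered resamples, so the bootstrap distribution of the mean can take at most four distinct values, and for a sample such as [1, 1, 1] it collapses to a single point.

```r
# Illustrative sketch: exhaustively enumerate bootstrap resamples for n = 3.
CrSA <- c(0, 1, 1)

# All 3^3 = 27 ordered index triples (sampling with replacement).
idx <- as.matrix(expand.grid(1:3, 1:3, 1:3))

# Mean of each resample; only 0, 1/3, 2/3 and 1 can occur.
means <- apply(idx, 1, function(i) mean(CrSA[i]))
print(table(round(means, 3)))

# Degenerate case: every resample of c(1, 1, 1) is identical,
# so the spread of the bootstrap means is exactly zero.
all_ones <- c(1, 1, 1)
means_ones <- apply(idx, 1, function(i) mean(all_ones[i]))
print(sd(means_ones))
```

The same happens for [0, 0, 0]: the resulting interval would have zero width, which clearly understates the uncertainty left by three observations.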

In my opinion, it would be better to ask which range of proportions is compatible with the observations, assuming your vector of 1s and 0s comes from a binomial distribution. For this you could use the `binom.test` function in R:

# 0 successes out of 3 trials:
binom.test(x= 0, n= 3)$conf.int
[1] 0.0000000 0.7075982
attr(,"conf.level")
[1] 0.95

# 1 success; 3 trials:
binom.test(x= 1, n= 3)$conf.int
[1] 0.008403759 0.905700676
attr(,"conf.level")
[1] 0.95

But obviously, with $n=3$ you get a very broad range.
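As a sketch of how wide the interval is for every possible outcome, the snippet below applies the same `binom.test` call to 0 through 3 successes, which covers the 2-out-of-3 case from the question. The names `n`, `ci`, and `result` are illustrative; `binom.test` reports the exact (Clopper-Pearson) interval.

```r
# Illustrative sketch: exact binomial 95% CIs for every outcome with n = 3.
n <- 3

# One row per possible number of successes; columns are the CI limits.
ci <- t(sapply(0:n, function(x) as.numeric(binom.test(x = x, n = n)$conf.int)))

result <- data.frame(successes = 0:n, lower = ci[, 1], upper = ci[, 2])
print(result)
```

Every interval stays within [0, 1], unlike the normal-approximation bootstrap interval in the question, whose upper limit is 1.2116.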

dariober
  • Question for dariober: I thought I read somewhere that the binomial distribution was only valid for n >= 5; doesn't using binom.test violate this 'requirement'? – stevebyers2000 Dec 02 '20 at 14:39
  • @stevebyers2000 the binomial distribution is defined ("valid") for any integer $n$ (number of trials) $\geq 0$ so $n=3$ is ok. Note also that I'm using `binom.test` only as a shortcut to get the CI of the proportion, I'm not testing any hypothesis. – dariober Dec 02 '20 at 14:50
  • Thank you, dariober. And thank you, COOLSerdash. I know now how to proceed. – stevebyers2000 Dec 02 '20 at 14:56