Based on the guidance provided below, I have revised my question.

How would I calculate a 95% CI for the mean of a beta-binomial distribution that ranges from 0 to 5 and can only take values that are multiples of 0.5? Any guidance would be much appreciated.

Is bootstrapping an appropriate method for determining a confidence interval for this data?

My data are body condition scores recorded by a veterinarian. I have two sets of data (set1 and set2).

set1 <- as.data.frame(c(3,3,2.5,2.5,4.5,3,2,4,3,3.5,3.5,2.5,3,3,3.5,3,3,4,3.5,3.5,4,3.5,3.5,4,3.5))
colnames(set1) <- "numbers"

set2 <- as.data.frame(c(2.5,4,5,4,5,5,5,5,5))
colnames(set2) <- "numbers"

The most similar question I have found is here

Pat Taggart
  • Calculation of a confidence interval implies that you know the distribution of the random variable. Your calculations assumed a Normal distribution. That doesn't give sensible answers for a discrete distribution with a limited range, especially when the results are clustered at the top of the range, as they are for one of your sets. It's not clear that you have understood that yet. Once you have the proper concepts and terms, you should be able to do a better job of searching for packages and functions that provide the needed calculations. Voting to migrate. – DWin Dec 30 '19 at 04:56
  • Similar question to: https://stats.stackexchange.com/questions/426968/how-can-i-handle-data-where-the-sampling-distribution-exceeds-the-range-of-the-d/426996#426996 – Dave2e Dec 30 '19 at 18:02

2 Answers

One approach is to use bootstrapping:

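library(simpleboot)  # provides one.boot(), a one-sample bootstrap wrapper
library(boot)        # provides boot.ci() for bootstrap confidence intervals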

set1 <- as.data.frame(c(3,3,2.5,2.5,4.5,3,2,4,3,3.5,3.5,2.5,3,3,3.5,3,3,4,3.5,3.5,4,3.5,3.5,4,3.5))
colnames(set1) <- "numbers"

# Resample the scores 10,000 times and take the mean of each resample
set1.boot = one.boot(set1$numbers, mean, R=10^4)
## hist(set1.boot)
boot.ci(set1.boot, type="bca")
## BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
## Based on 10000 bootstrap replicates
## 
## CALL : 
## boot.ci(boot.out = set1.boot, type = "bca")
## 
## Intervals : 
## Level       BCa          
## 95%   ( 3.04,  3.48 )  
## Calculations and Intervals on Original Scale

set2 <- as.data.frame(c(2.5,4,5,4,5,5,5,5,5))
colnames(set2) <- "numbers"

set2.boot = one.boot(set2$numbers, mean, R=10^4)
## hist(set2.boot)
boot.ci(set2.boot, type="bca")
## BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
## Based on 10000 bootstrap replicates
## 
## CALL : 
## boot.ci(boot.out = set2.boot, type = "bca")
## 
## Intervals : 
## Level       BCa          
## 95%   ( 3.611,  4.889 )  
## Calculations and Intervals on Original Scale
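
As a rough cross-check that needs no extra packages, a plain percentile bootstrap can be sketched in base R. This is a different interval type from the BCa intervals above, so the endpoints will not match them exactly. The seed and the object names (set2_scores, boot_means, ci_percentile) are assumptions made for illustration.

```r
set.seed(2019)  # arbitrary seed, only so the sketch is reproducible

set2_scores <- c(2.5, 4, 5, 4, 5, 5, 5, 5, 5)

# Resample with replacement and record the mean of each resample
boot_means <- replicate(10000, mean(sample(set2_scores, replace = TRUE)))

# Percentile interval: the middle 95% of the bootstrap means
ci_percentile <- quantile(boot_means, c(0.025, 0.975))
ci_percentile
```

Because every resample consists only of observed scores, the endpoints cannot fall outside the observed range of 2.5 to 5. With only nine observations, though, any bootstrap interval should be read with caution.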
BigFinger
    Shouldn't we have concerns about estimating a 95% CI from a dataset with only 10 instances? – DWin Dec 30 '19 at 05:37
I think you might make progress by asking your audience to assume that these values follow a beta-binomial distribution on the set {(0:10)/2}, that is, the half-point values in the range [0,5]. The beta-binomial distribution arises from a different process than your situation, but it is an ordered discrete distribution.
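
A minimal sketch of that idea in base R, under stated assumptions: each score x is mapped to a count k = 2x out of n = 10, the beta-binomial is parametrized by its mean proportion mu and a dispersion theta = a + b, and the interval is a Wald interval computed on the logit scale and then back-transformed. The function name dbetabinom_log and the choice of interval are illustrative, not something the question prescribes. set2 is used because its scores vary more than a binomial with the same mean would, which is the situation the beta-binomial is meant for; for set1 the dispersion parameter may be poorly identified.

```r
# Log-pmf of the beta-binomial for k successes out of n, written in terms of
# the mean proportion mu and dispersion theta (shape1 = mu * theta,
# shape2 = (1 - mu) * theta). Hypothetical helper, not from any package.
dbetabinom_log <- function(k, n, mu, theta) {
  a <- mu * theta
  b <- (1 - mu) * theta
  lchoose(n, k) + lbeta(k + a, n - k + b) - lbeta(a, b)
}

set2_scores <- c(2.5, 4, 5, 4, 5, 5, 5, 5, 5)
k <- round(2 * set2_scores)  # half-point scores 0..5 become counts 0..10
n <- 10

# Negative log-likelihood on unconstrained scales: logit(mu) and log(theta)
negll <- function(par) {
  -sum(dbetabinom_log(k, n, mu = plogis(par[1]), theta = exp(par[2])))
}

# Start from the sample proportion and an arbitrary dispersion of 10
fit <- optim(c(qlogis(mean(k) / n), log(10)), negll, hessian = TRUE)

# Approximate standard error of logit(mu) from the inverse Hessian
se_logit_mu <- sqrt(diag(solve(fit$hessian)))[1]

# Mean score is n * mu / 2 = 5 * mu; back-transform the Wald limits
mean_hat <- 5 * plogis(fit$par[1])
ci_mean  <- 5 * plogis(fit$par[1] + c(-1, 1) * qnorm(0.975) * se_logit_mu)

mean_hat
ci_mean
```

Working on the logit scale keeps both limits strictly inside (0, 5). With nine observations the Normal approximation behind a Wald interval is crude, so a profile-likelihood or parametric-bootstrap interval would be a reasonable alternative.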

Ben Bolker has a nice discussion of simulation using the beta-binomial, with examples of estimating parameters from data using R and his bbmle package: https://cran.r-project.org/web/packages/bbmle/vignettes/mle2.pdf . (The package name is not an initialism of beta-binomial; it comes from his name.)

DWin