2

I am auditing insurance claims for fraud. In my random sample, all the claims are fraudulent. Could I say for the population that the estimated proportion of fraud is 1? I know how to compute a confidence interval when p=1. But what should be the point estimate?

kjetil b halvorsen
Red Rock
  • 2
    The point estimate would be 1, yes. But the confidence interval is more informative. – COOLSerdash Apr 21 '17 at 18:23
  • Unfortunately, when the sample proportion is 1 (or 0), the estimated standard error will be equal to zero. Thus, your confidence interval is going to be [1, 1], which is totally unhelpful. As an alternative, you could look to Bayesian estimation. But in all honesty, it sounds like your sample is too small or improperly collected. It will be hard to make useful inferences here. – jjet Apr 21 '17 at 18:30
  • 4
    @jjet You seem to have in mind an invalid confidence interval procedure, because there are perfectly good procedures available. The "Rule of Three" works very well in this case: it puts a lower confidence limit at $1-3/n$ for a sample size of $n$ (assuming it's a representative sample and the data can be considered independent, of course). See https://en.wikipedia.org/wiki/Rule_of_three_(statistics). For a very closely related question, please see https://stats.stackexchange.com/questions/274855 (along with the answers and comments). – whuber Apr 21 '17 at 18:38
  • There are always exact confidence intervals and median unbiased estimates. – Björn Apr 21 '17 at 18:38
  • @COOL Your use of "the" might suggest to some that there is only one possible estimate. In fact, *many* valid estimates are available. A wide array of them can be obtained by modeling this as Binomial sampling and adopting a Beta$(\alpha,\beta)$ prior (the conjugate prior). For quadratic loss, its point estimate is $\hat{p} = (n+\alpha)/(n+\alpha+\beta)$, which will be less than $1$. – whuber Apr 21 '17 at 18:41
  • The rule of three is pretty neat. I'd never seen that before. From a practical standpoint though, I suspect that any CI method could lead OP astray. If, say, $n=5$ and the data weren't randomly selected, then is there really any point in doing statistics? I think more info is needed before a point estimate/CI can be implemented. – jjet Apr 21 '17 at 19:02
  • @whuber Sure! I was referring to the "usual" $x/n$ point estimate. I should have replaced "the" by "one among many". Thanks for pointing it out. – COOLSerdash Apr 21 '17 at 19:12
  • Check also https://stats.stackexchange.com/questions/134380/how-to-tell-the-probability-of-failure-if-there-were-no-failures/190573 – Tim Apr 21 '17 at 21:48
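
The estimates mentioned in the comments above can be compared numerically. What follows is only a minimal sketch, assuming a representative sample of $n$ independent claims that all turned out fraudulent. The function names are hypothetical, and the Beta$(1,1)$ default prior is an illustrative assumption, not something the thread prescribes. The exact bound solves $p^n = \alpha$ for a one-sided limit; since $-\ln(0.05) \approx 3$, it is close to the Rule of Three value $1-3/n$.

```python
import math


def rule_of_three_lower(n: int) -> float:
    """Approximate 95% lower confidence limit for p when all n sampled items are 'successes'."""
    return max(0.0, 1.0 - 3.0 / n)


def exact_lower(n: int, alpha: float = 0.05) -> float:
    """Exact one-sided lower limit when x = n: the value of p solving p**n = alpha."""
    return alpha ** (1.0 / n)


def beta_posterior_mean(n: int, a: float = 1.0, b: float = 1.0) -> float:
    """Posterior mean (n + a) / (n + a + b) under a Beta(a, b) prior after n successes in n trials.

    The Beta(1, 1) default is an assumption made for illustration only.
    """
    return (n + a) / (n + a + b)


# Compare the three quantities for a few illustrative sample sizes.
for n in (5, 30, 100):
    print(n, rule_of_three_lower(n), exact_lower(n), beta_posterior_mean(n))
```

For small $n$ the lower limits are far below 1, which is why the interval says more than the point estimate of 1 does.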

0 Answers