
So it's like saying that 95% of the items sold on this website cost between $25 and $150, while some items might cost less than $25 and others might cost more than $150.

Is there a way to find this range? Is it related to a confidence interval (CI)?

Prabhu M
  • What do you mean by 'the minimum range'? – Macro Mar 22 '12 at 19:24
  • See [Reference range](http://en.wikipedia.org/wiki/Reference_range) on Wikipedia. – onestop Mar 22 '12 at 20:08
  • What data or related information do you have to begin with? – whuber Mar 22 '12 at 20:14
  • Weakly related: http://stats.stackexchange.com/questions/24588/quantile-intervals-vs-highest-posterior-density-intervals/ – Henry Mar 22 '12 at 21:52
  • Does the result need to be an actual value, or can it be an interpolated value? – Michelle Mar 22 '12 at 23:44
  • This sounds like a homework question. It has to do with standard deviation, which is used to calculate confidence intervals. I would say more, but in the interest of facilitating your learning I'll direct you to the Wikipedia page on standard deviation instead: http://en.wikipedia.org/wiki/Standard_deviation I'd be happy to discuss this further, but please read that, make an attempt at solving the problem, and then ask some intelligent questions and we can go from there. – Will Mar 23 '12 at 02:30
  • Values can be either actual or interpolated. – Prabhu M Mar 23 '12 at 03:47
  • Yes, I can consider this homework from work :-) There is revenue data that I am analyzing. There are outliers at the high and low ends due to human data-entry errors, so I was planning to trim the top 2.5% and bottom 2.5% of observations to eliminate outliers from my analysis. I am new to data analysis and statistics, hence this noob question. – Prabhu M Mar 23 '12 at 04:02

1 Answer


If you mean the narrowest range that contains 95% of the values, then something like this would do it:

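set.seed(1)
dat        <- rnorm(1000000)
ordereddat <- sort(dat)
# number of steps between the endpoints of a window covering 95% of the data
width      <- ceiling(length(ordereddat) * 0.95)
# width of every such window: ordereddat[i + width] - ordereddat[i]
difdat     <- diff(ordereddat, lag = width)
# start of the narrowest window (named `start` so as not to mask base::min)
start      <- which.min(difdat)
c(ordereddat[start], ordereddat[start + width])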

producing

[1] -1.962899  1.961126

If you want as many items above the range as below it, then there is the simpler but slightly wider equal-tailed range:

> c(quantile(dat, 0.025),  quantile(dat, 0.975))
     2.5%     97.5% 
-1.960232  1.964565  
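
For skewed data such as prices or revenue the two ranges can differ more than they do for the normal sample above. The following is only a rough sketch under assumptions: `revenue` is simulated log-normal data rather than anything from the question, and `narrowest_range` is a made-up helper name that wraps the sorted-difference search. It also shows the trimming of 2.5% from each tail that the asker describes in the comments.

```r
# Hypothetical right-skewed data standing in for prices or revenue (simulated).
set.seed(1)
revenue <- rlnorm(10000, meanlog = 4, sdlog = 0.5)

# Narrowest range covering at least `coverage` of the observations.
# The helper name is illustrative, not part of any package.
narrowest_range <- function(x, coverage = 0.95) {
  x     <- sort(x)
  width <- ceiling(length(x) * coverage)
  start <- which.min(diff(x, lag = width))
  c(lower = x[start], upper = x[start + width])
}

narrow <- narrowest_range(revenue)
equal  <- quantile(revenue, c(0.025, 0.975), names = FALSE)

# Trim 2.5% from each tail by keeping only values inside the equal-tailed range.
trimmed <- revenue[revenue >= equal[1] & revenue <= equal[2]]

diff(narrow)                       # width of the narrowest range
diff(equal)                        # width of the equal-tailed range
length(trimmed) / length(revenue)  # share of observations kept
```

On right-skewed data the narrowest range sits further to the left than the equal-tailed one, so it drops fewer low values and more high ones. Whether that is desirable for outlier removal depends on where the data-entry errors are expected.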
Henry