How to find the minimum range(start/end) which covers 95% of items in a numerical list?

Question

So it's like saying that 95% of the items sold on this website cost between $25 and $150, while some items might cost less than $25 and others might cost more than $150.

Is there a way to find this range? Is it related to confidence intervals (CI)?

See [Reference range](http://en.wikipedia.org/wiki/Reference_range) on Wikipedia. — onestop, Mar 22 '12 at 20:08
Weakly related: http://stats.stackexchange.com/questions/24588/quantile-intervals-vs-highest-posterior-density-intervals/ — Henry, Mar 22 '12 at 21:52
Does the result need to be an actual value, or can it be an interpolated value? — Michelle, Mar 22 '12 at 23:44
This sounds like a homework question. It has to do with standard deviation which is used to calculate confidence intervals. I would say more but in the interest of facilitating your learning I'll direct you to the wikipedia page on standard deviation instead. http://en.wikipedia.org/wiki/Standard_deviation I'd be happy to discuss this further but please read that, make an attempt at solving the problem, and then ask some intelligent questions and we can go from there. — Will, Mar 23 '12 at 02:30
Yes, I can consider this homework from work :-) There is revenue data that I am analyzing. There are outliers at the high end and the low end due to human data entry errors, so I was planning to trim the top 2.5% and bottom 2.5% of observations to eliminate outliers from my analysis. I am new to data analysis and statistics, hence this noob question. — Prabhu M, Mar 23 '12 at 04:02

Henry · Accepted Answer · 2012-03-22T19:34:15.537

If you mean the narrowest range, then something like this would do it:

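set.seed(1)
dat        <- rnorm(1000000)
ordereddat <- sort(dat)
# index offset between the two ends of a window covering about 95% of the points
width      <- ceiling(length(ordereddat) * 0.95)
# span of every such window: ordereddat[i + width] - ordereddat[i]
difdat     <- diff(ordereddat, lag = width)
# start index of the narrowest window (named idx so it does not mask base::min)
idx        <- which(difdat == min(difdat))
c(ordereddat[idx[1]], ordereddat[idx[1] + width])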

producing

[1] -1.962899  1.961126
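
The same idea can be wrapped in a reusable function. What follows is only a sketch and not part of the original answer: `narrowest_range` and the simulated `prices` vector are made-up names, and the window is taken to be exactly `ceiling(n * coverage)` consecutive sorted values, which is just one possible convention. On right-skewed data such as prices, the narrowest range will usually sit closer to the low end than the equal-tailed quantile range does.

```r
# Hypothetical helper (an assumption, not from the answer): narrowest interval
# that contains at least `coverage` of the non-missing values in x.
narrowest_range <- function(x, coverage = 0.95) {
  x <- sort(x)                      # sort() drops NA values by default
  n <- length(x)
  k <- ceiling(n * coverage)        # number of points the window must hold
  # width of every window of k consecutive sorted points
  widths <- x[k:n] - x[1:(n - k + 1)]
  i <- which.min(widths)            # first narrowest window
  c(lower = x[i], upper = x[i + k - 1])
}

set.seed(1)
prices <- rlnorm(10000, meanlog = 4, sdlog = 0.5)  # made-up right-skewed "prices"

narrowest_range(prices)             # narrowest 95% range
quantile(prices, c(0.025, 0.975))   # equal-tailed 95% range, for comparison
```

Sorting dominates the cost, so this runs in roughly O(n log n) time even for large lists.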

If you want as many items above the range as below it, then there is the simple but slightly wider:

> c(quantile(dat, 0.025),  quantile(dat, 0.975))
     2.5%     97.5% 
-1.960232  1.964565
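
For the trimming use case mentioned in the comments (dropping the top and bottom 2.5% of observations), the quantiles can be used directly as cutoffs. This is a minimal sketch under stated assumptions: the `revenue` vector is simulated with two planted data entry errors, it contains no missing values, and the names `cutoffs` and `trimmed` are illustrative only.

```r
set.seed(1)
# made-up revenue data plus two extreme values standing in for entry errors
revenue <- c(rnorm(1000, mean = 100, sd = 15), 1e6, -1e6)

# 2.5% and 97.5% quantiles used as lower and upper cutoffs
cutoffs <- quantile(revenue, c(0.025, 0.975))

# keep only the observations inside the cutoffs (assumes no NA values)
trimmed <- revenue[revenue >= cutoffs[1] & revenue <= cutoffs[2]]

length(trimmed) / length(revenue)   # fraction kept, close to 0.95
range(trimmed)                      # the planted extremes are gone
```

Note that this always discards about 5% of the data, whether or not those observations are actually errors.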

edited Mar 22 '12 at 19:34

answered Mar 22 '12 at 19:26


I think c(quantile(dat, 0.025), quantile(dat, 0.975)) fits what I was looking for. Thank you. – Prabhu M, Mar 23 '12 at 04:03
