0

The commonly available spreadsheet function PERCENTILE / PERCENTILE.INC is described as giving the alpha quantile of a sample.

The 95th Percentile is described as the value below which 95% of the samples fall - [Wikipedia] - please correct it if wrong1.

This is commonly employed in "95th Percentile Billing" for services, which claims to "ignore sporadic usage spikes" - and is offered as a fairer (and cheaper) alternative to paying for a fixed level, but companies employ different methods of calculating it - some use p, and some 1-p. 1) Which is technically correct? p or 1-p?

Say I have a series of usage samples indicating transactions per minute of:

A1. 20
A2. 20
A3. 20
A4. 100
A5. 20

It's clear that 80% of the samples are 20 or below, ignoring the spike of 100. The function PERCENTILE(A1:A5,1-0.8)=20, which is what the lay customer would expect, whereas PERCENTILE(A1:A5,0.8)=36, which seems to not ignore this usage spike.

Also, consider the example:

A1. 20
A2. 20
A3. 100
A4. 100
A5. 20

PERCENTILE(A1:A5,0.8)=100 and PERCENTILE(A1:A5,0.95)=100 - which seems to clearly be the "upper 5%" while PERCENTILE(A1:A5,0.2)=20 seems more intuitively correct, and PERCENTILE(A1:A5,0.05)=15 defeating intuition as its lower than any value in the range.

2) How can I explain this lower value, to people who do not have a mathematical background - including myself? 3) Is either p or 1-p a technically correct value for p being the "lowest 95 percentile"? 4) Is there a formula that would most accurately average the lower 95% of samples? What is it called?

This could easily be a $1 000 000 / year question.

Dagelf
  • 101
  • 3
  • @whuber I read that question and its answers, and I still don't know how to answer my own questions... am I just stupid? Shouldn't I come here for help? – Dagelf Dec 03 '19 at 11:02
  • We can help if you would edit your question to clarify how it differs from the apparent duplicate: in other words, focus the question on what you are seeking to find out. – whuber Dec 03 '19 at 14:38

0 Answers0