
The Pareto rule states that 20% of records account for 80% of the total. In fact, this is just a special case that holds for certain datasets.

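If we take the series from 1 to 10, we can easily see that the Pareto rule turns into a 40/60 rule: the top 40% of the values (7, 8, 9, 10) sum to 34 out of a total of 55, which is about 62%.

[Figure: Pareto rule for the 1..10 series]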

Is there a generic term for this kind of metric, and is there a package in Excel, R, Python, or Power Query to calculate it?

I would like to easily calculate this or a similar metric for any given dataset, from company sales to word frequencies in a given text.

  • Are you asking for a general [inequality index](https://en.wikipedia.org/wiki/Income_inequality_metrics)? – COOLSerdash May 16 '20 at 14:23
  • Not exactly. They are overly complicated and don't express what I am looking for. If you take any series of numbers and sort it in ascending order, then you can calculate the "Pareto-like" ratio, which is different for different series: x/y, where x% of the records yields y% of the total AND x + y = 100. 80/20 is just a special case for some distributions, and for [1..10] this ratio is about 40/60. It is a simple metric that lets me see at a glance a property of the distribution: heavy or light tailed, etc. – Andrew Anderson May 16 '20 at 17:29
  • But the metric I am looking for would certainly be a good visual inequality index applied to an income dataset. – Andrew Anderson May 16 '20 at 17:39
  • One can compare tail heaviness between two different distributions as a [limiting value](https://stats.stackexchange.com/a/274908/99274), and thereby ascertain which is heavier *in the limit*. However, classification of $n$-tuple tail heavinesses for $n$-fold distributions into a single index would be heuristic, and variable depending on what non-limiting cutoff value(s) is/are chosen for that comparison. – Carl May 17 '20 at 14:37
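
The ratio described in the comments above (the top x% of records holding y% of the total, with x + y = 100) can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not an established package: the function name `pareto_balance` is hypothetical, values are assumed to be non-negative, and because a finite series rarely hits x + y = 1 exactly at a record boundary, the sketch interpolates linearly between the two neighbouring cumulative points.

```python
def pareto_balance(values):
    """Return (x, y): the top fraction x of records holds share y of the
    total, where x + y = 1. Hypothetical helper, illustrative only."""
    data = sorted(values, reverse=True)  # largest records first
    n = len(data)
    total = sum(data)
    if n == 0 or total <= 0 or any(v < 0 for v in data):
        raise ValueError("need non-negative values with a positive total")

    prev_x, prev_y = 0.0, 0.0  # cumulative point before the current record
    running = 0.0
    for k, v in enumerate(data, start=1):
        running += v
        x = k / n              # fraction of records taken so far
        y = running / total    # share of the total they hold
        if x + y >= 1.0:
            # Assumption: interpolate linearly between the previous and the
            # current cumulative point to locate where x + y equals 1.
            prev_s = prev_x + prev_y
            t = (1.0 - prev_s) / ((x + y) - prev_s)
            balance_x = prev_x + t * (x - prev_x)
            return balance_x, 1.0 - balance_x
        prev_x, prev_y = x, y


if __name__ == "__main__":
    x, y = pareto_balance(range(1, 11))
    print(f"top {x:.1%} of records hold {y:.1%} of the total")
```

For the series 1..10 this gives roughly 0.392 and 0.608, in line with the "about 40/60" figure quoted in the question; a series of identical values gives 0.5 and 0.5.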
