Build an index to determine a ranking

Question

Suppose we have a dataframe similar to the following (I have more than 3 columns and all values in the dataframe are in %):

          % of municipalities with internet connection | % of households with internet access | Phenomenon 3  
Region 1                      80%                      |                  85%                 | ...
Region 2                      95%                      |                  90%                 | ...
Region 3                      90%                      |                  95%                 | ...
Region 4                      75%                      |                  80%                 | ...

I need to build an index to determine a ranking of the regions based on the observed phenomena. Anyone have any proposals? If I had only one phenomenon it would be enough to sort the regions by the %, but with more phenomena I don't know exactly how to proceed.

This might help: https://stats.stackexchange.com/questions/108418/multivariate-sorting-ranking — Darren James, Nov 18 '21 at 17:08
Region 2 has a higher municipal internet connection rate than region 3, which has a higher rate than region 1, which has a higher rate than region 4, while region 3 has a higher household internet connection rate than region 2, which has a higher rate than region 1, which has a higher rate than region 4. // You already knew how to do that, so what is it that you want to rank? — Dave, Nov 18 '21 at 17:08
This is a FAQ appearing under many guises. It asks for a way of ranking objects based on two or more characteristics (which might be uncertain--it doesn't matter). *It has no statistical answer,* because the tradeoffs you make between the characteristics depend on how *you* value them. — whuber, Nov 18 '21 at 17:10
@DarrenJames are you proposing the calculation of the z-score of each column? — LJG, Nov 18 '21 at 17:17
@whuber I don't understand your argument. In general if a % is high then the respective region is better placed in the ranking. — LJG, Nov 18 '21 at 17:20
@LJG Then you have your rankings...just rank based on the percentages. — Dave, Nov 18 '21 at 17:21
@Dave the answer linked by Darren James proposes that, in Step 1) and Step 2). — LJG, Nov 18 '21 at 17:21
@Dave yes, but I need an overall ranking (one that summarizes all variables). — LJG, Nov 18 '21 at 17:22
And that's exactly the point: any ranking of all the variables reflects tradeoffs between values of each of the variables. Although statistical thinking has informed development of theories of ranking (and utility), it gets us only as far as having principled methods to elicit a mathematically consistent set of *your subjective values.* The answer in the link offered by @Darren is one of infinitely many solutions and thereby suffers by being *completely* arbitrary. — whuber, Nov 18 '21 at 17:42
The example in the link would involve calculating a z-score for each row (Region). As @whuber mentions, this is only one potential strategy among many and you will have to decide whether it's useful in meeting your objectives. — Darren James, Nov 18 '21 at 19:47
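
To make the point of the comments concrete, here is a minimal sketch of one possible composite index: standardize each column to z-scores, take a weighted sum per region, and sort. It is only one of many possible constructions, not a recommended method. The function name `composite_ranking`, the short column keys, and both weight vectors are assumptions made for illustration. Only the two columns shown in the question are used, and the sketch assumes that higher is better for every column and that no column is constant (a constant column would give a zero standard deviation).

```python
from statistics import mean, pstdev

# Values from the question's table (only the two columns that are shown).
data = {
    "Region 1": {"municipalities": 80, "households": 85},
    "Region 2": {"municipalities": 95, "households": 90},
    "Region 3": {"municipalities": 90, "households": 95},
    "Region 4": {"municipalities": 75, "households": 80},
}


def composite_ranking(data, weights):
    """Rank regions by a weighted sum of per-column z-scores (higher is better)."""
    scores = {region: 0.0 for region in data}
    for col, weight in weights.items():
        values = [row[col] for row in data.values()]
        mu, sigma = mean(values), pstdev(values)  # sigma must be non-zero
        for region, row in data.items():
            scores[region] += weight * (row[col] - mu) / sigma
    # Best region first.
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical weights: equal importance vs. favouring municipal coverage.
equal = composite_ranking(data, {"municipalities": 0.5, "households": 0.5})
tilted = composite_ranking(data, {"municipalities": 0.7, "households": 0.3})

print(equal)   # ['Region 3', 'Region 2', 'Region 1', 'Region 4']
print(tilted)  # ['Region 2', 'Region 3', 'Region 1', 'Region 4']
```

With equal weights Region 3 comes first. With more weight on municipal coverage Region 2 overtakes it. The data are the same in both runs, so the order of the top two regions is decided entirely by the chosen weights, which is the tradeoff whuber describes.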


0 Answers
