multivariate sorting / ranking

Question

I have data on 1000 students' performance over 10 different tests, each scored on a scale of 0-100 (a 1000-row × 10-column matrix). I calculated the mean score and the associated coefficient of variation for each student. Now I wish to sort / rank students who consistently perform better, that is, who have a high mean score and a low coefficient of variation. Can anyone suggest a ranking scheme based on two objective variables (mean score and coefficient of variation)? Thanks.

score 0 · Accepted Answer · answered Jul 18 '14 at 14:52

There are numerous ways to do this. I would translate each student's scores into a standard normal calculation, so you can think of the result as the probability that the student scores higher than X*. To a certain degree X is arbitrary, but I would set X to the mean of all test scores. Since you haven't given specific data or a specific language, I'm just going to write out the steps you would need to perform in any programming language.

   Step 1) Calculate X = mean of all test scores (all students, all tests)
   Step 2) For each student, calculate z = (student's average - X) / (student's std dev)
   Step 3) For each student, calculate the standard normal cumulative distribution at z

*This would be the probability that a student scores higher than the average student on a test, assuming that the student's test scores are normally distributed. The validity of that assumption is up for some debate, but if you are simply trying to rank the students in order, this would give you the correct order. If you are trying to say that student A is p% better than student B, then the assumption about the underlying distribution matters.
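The three steps above could be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not part of the original answer: the function names (`prob_above_overall_mean`, `rank_students`) and the toy data are made up, the sample standard deviation is used in Step 2, and a student with zero spread is handled by a hand-picked rule because z is undefined in that case.

```python
from statistics import NormalDist, mean, stdev


def prob_above_overall_mean(scores):
    """scores: one list of test scores per student.

    Returns, per student, the standard normal CDF evaluated at
    z = (student mean - X) / (student std dev), with X the overall mean.
    """
    # Step 1: X = mean of all test scores across all students.
    x = mean(s for row in scores for s in row)
    std_normal = NormalDist()  # mean 0, std dev 1

    probs = []
    for row in scores:
        m = mean(row)
        sd = stdev(row)  # assumption: sample standard deviation
        if sd == 0:
            # Assumption: with no spread, z is undefined, so fall back to
            # 1, 0.5 or 0 depending on which side of X the mean lies.
            p = 1.0 if m > x else (0.0 if m < x else 0.5)
        else:
            # Step 2: per-student z; Step 3: cumulative normal of z.
            p = std_normal.cdf((m - x) / sd)
        probs.append(p)
    return probs


def rank_students(scores):
    """Student indices ordered from highest to lowest probability."""
    probs = prob_above_overall_mean(scores)
    return sorted(range(len(scores)), key=lambda i: probs[i], reverse=True)


# Hypothetical toy data: 3 students, 4 tests each.
students = [
    [90, 92, 88, 91],  # high and consistent
    [95, 60, 99, 70],  # fairly high but erratic
    [50, 52, 48, 51],  # low and consistent
]
ranking = rank_students(students)
```

With the real data, `scores` would be the 1000 rows of 10 test scores, and `ranking` would list row indices from best to worst under this scheme.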

Thanks. Can I set "X" to a mean of 100 with std. dev. = 0 in Step 1? It would be like having an imaginary student who scores a perfect 100 on all the tests. — user3816627, Jul 18 '14 at 15:37
First of all, X is a constant, so it will always have a std of 0 (even when X is defined as the average of all the exams, X is still deterministic). X is arbitrary and can be whatever you want and will not change your results (at least not the ranks; the % by which one student is better than another will change). I like the mean (or median) because you will get some values above X and some below. If you set X to 100, all averages are going to be below (or equal to) X. The normal distribution has both positive and negative values. — Eric, Jul 18 '14 at 21:05
The problem with this proposal lies in its arbitrary nature. See https://stats.stackexchange.com/questions/9358. — whuber, Nov 18 '21 at 17:11
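
One way to probe the arbitrariness concern is to recompute the ranking for two different choices of X and compare. The sketch below is an illustration only: the function name `rank_by_threshold` and the two score lists are hypothetical, and the sample standard deviation is assumed. For this particular pair of students, the order under X = 70 differs from the order under X = 100, so it may be worth checking how sensitive the ranking is to X on the real data.

```python
from statistics import NormalDist, mean, stdev


def rank_by_threshold(scores, x):
    """Rank student indices by the normal CDF of (mean - x) / std dev, best first.

    Assumes every student has a non-zero sample standard deviation.
    """
    std_normal = NormalDist()

    def prob_above(row):
        return std_normal.cdf((mean(row) - x) / stdev(row))

    return sorted(range(len(scores)), key=lambda i: prob_above(scores[i]), reverse=True)


# Hypothetical data: a steady student and a higher-scoring but erratic one.
pair = [
    [78, 80, 82, 80],    # mean 80, small spread
    [70, 100, 80, 100],  # mean 87.5, large spread
]

order_low_x = rank_by_threshold(pair, 70)    # steady student ranked first
order_high_x = rank_by_threshold(pair, 100)  # erratic student ranked first
```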
