Calculate mean of ordinal variable

Question

I've read in a number of places that calculating the mean of an ordinal variable is inappropriate. I'm trying to get an intuition for why it might be inappropriate. I think it is because, in general, an ordinal variable is not normally distributed and so calculating the mean will give an inaccurate representation. Could somebody give more detailed reasoning for why calculating the mean of an ordinal variable might be inappropriate?

To calculate a mean, you first need a sum. For a sum to be meaningful, you need that 4+2 to be the same as 3+3; equivalently, you need 4-3 = 3-2 = 2-1. With ordinal data - even when its categories are labelled "1","2","3","4" - this is (quite explicitly) not necessarily the case. — Glen_b, Aug 16 '13 at 09:17
And why would the median be more appropriate than the arithmetic mean? — , Feb 26 '16 at 20:22

Nick Cox · Answer 1 · 2019-10-03T13:35:38.597

A short answer is that this is contentious. Contrary to the advice you mention, people in many fields do take means of ordinal scales and are often happy that means do what they want. Grade-point averages or the equivalent in many educational systems are one example.

However, ordinal data not being normally distributed is not a valid reason, because the mean is (a) widely used for non-normal distributions and (b) well-defined mathematically for very many non-normal distributions, except in some pathological cases.

It may not be a good idea to use the mean in practice if data are definitely not normally distributed, but that's different.

A stronger reason for not using the mean with ordinal data is that its value depends on conventions on coding. Numerical codes such as 1, 2, 3, 4 are usually just chosen for simplicity or convenience, but in principle they could equally well be 1, 23, 456, 7890 as far as corresponding to a defined order is concerned. Taking the mean in either case would involve taking those conventions literally (namely, as if the numbers were not arbitrary, but justifiable), and there are no rigorous grounds for doing that. You need an interval scale in which equal differences between values can be taken literally to justify taking means. That I take to be the main argument, but, as already indicated, people often ignore it, and do so deliberately, because they find means useful, whatever measurement theorists say.
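To make the dependence on coding concrete, here is a minimal sketch in Python. The two groups of responses are hypothetical, invented only for illustration; the two codings are the ones mentioned above. Under the integer codes group B has the higher mean, under the alternative codes group A does, while the comparison of medians is the same under both.

```python
from statistics import mean, median

# Two order-preserving codings of the same four ordered categories.
integer_codes = {1: 1, 2: 2, 3: 3, 4: 4}
arbitrary_codes = {1: 1, 2: 23, 3: 456, 4: 7890}

# Hypothetical responses, recorded as category ranks (1 = lowest, 4 = highest).
group_a = [1, 1, 4]
group_b = [2, 2, 3]

def summarize(ranks, codes):
    """Return (mean, median) of the responses after applying a coding."""
    values = [codes[r] for r in ranks]
    return mean(values), median(values)

for name, codes in [("integer", integer_codes), ("arbitrary", arbitrary_codes)]:
    mean_a, median_a = summarize(group_a, codes)
    mean_b, median_b = summarize(group_b, codes)
    print(f"{name:9s} mean A > mean B: {mean_a > mean_b}; "
          f"median A > median B: {median_a > median_b}")
```

Odd-sized groups are used so that the median is always one of the observed categories and never needs averaging two codes.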

Here is an extra example. Often people are asked to choose one of "strongly disagree" ... "strongly agree" and (depending partly on what the software wants) researchers code that as 1 .. 5 or 0 .. 4 or whatever they want, or declare it as an ordered factor (or whatever term the software uses). Here the coding is arbitrary and hidden from the people who answer the question.

But often also people are asked (say) on a scale of 1 to 5, how do you rate something? Examples abound: websites, sports, other kinds of competitions and indeed education. Here people are being shown a scale and being asked to use it. It is widely understood that non-integers make sense, but you are just being allowed to use integers as a convention. Is this an ordinal scale? Some say yes, some say no. Otherwise put, part of the problem is that what counts as an ordinal scale is itself a fuzzy or debated area.

Consider again grades for academic work, say E to A. Often such grades are also treated numerically, say as 1 to 5, and routinely people calculate averages for students, courses, schools, etc. and do further analyses of such data. While it remains true that any mapping to numeric scores is arbitrary but acceptable so long as it preserves order, nevertheless in practice people assigning and receiving the grades know that scores have numeric equivalents and know that grades will be averaged.

One pragmatic reason for using means is that medians and modes are often poor summaries of the information in the data. Suppose you have a scale running from strongly disagree to strongly agree and for convenience code those points 1 to 5. Now imagine one sample coded 1, 1, 2, 2, 2 and another 1, 2, 2, 4, 5. Now raise your hands if you think that median and mode are the only justifiable summaries because it's an ordinal scale. Now raise your hands if you find the mean useful too, regardless of whether sums are well defined, etc.
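The two samples in that thought experiment can be checked directly. This is only a sketch using Python's standard `statistics` module; the 1 to 5 coding is the convenience coding assumed in the paragraph above.

```python
from statistics import mean, median, mode

# The two samples from the text, coded 1 (strongly disagree) to 5 (strongly agree).
sample_1 = [1, 1, 2, 2, 2]
sample_2 = [1, 2, 2, 4, 5]

for label, sample in [("sample 1", sample_1), ("sample 2", sample_2)]:
    print(label, "median:", median(sample), "mode:", mode(sample), "mean:", mean(sample))
```

Median and mode are 2 for both samples, so they cannot tell the samples apart; the means (1.6 and 2.8) can, provided one is willing to take the codes literally.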

Naturally, the mean would be a hypersensitive summary if the codes were the squares or cubes of 1 to 5, say, and that might not be what you want. (If your aim is to identify high-fliers quickly it might be exactly what you want!) But that's precisely why conventional coding with successive integer codes is a practical choice, because it often works quite well in practice. That is not an argument which carries any weight with measurement theorists, nor should it, but data analysts should be interested in producing information-rich summaries.

I agree with anyone who says: use the entire distribution of grade frequencies, but that is not the point at issue.

Great answer and pragmatism is important, but I would add one note of caution. A good reason for only using formally established methods is that you get access to estimations of certainty &c. For example if we have two GPAs, say 4.53 and 4.34, we may want to know if one is "significantly" better than the other. But due to the lack of formality in the averaging of the grades, we don't get things like confidence intervals &c. — Stephen McAteer, Jan 27 '16 at 02:13
@StephenMcAteer I see your point in terms of the methods taught in a typical introductory text or course. But if that were the desire, bootstrapping has provided a technology allowing confidence intervals for almost 40 years now. — Nick Cox, Jan 27 '16 at 08:53
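As a rough illustration of the bootstrapping mentioned in the comment above, the sketch below computes a percentile bootstrap interval for a mean grade. The grades, the function name `bootstrap_mean_ci`, the number of resamples and the seed are all assumptions made for the example, not anything from the thread, and the percentile method is only one of several bootstrap intervals.

```python
import random
from statistics import mean

def bootstrap_mean_ci(values, n_resamples=2000, level=0.95, seed=0):
    """Percentile bootstrap confidence interval for the mean of `values`."""
    rng = random.Random(seed)
    n = len(values)
    # Resample with replacement and record the mean of each resample.
    means = sorted(mean(rng.choices(values, k=n)) for _ in range(n_resamples))
    alpha = (1 - level) / 2
    lower = means[int(alpha * n_resamples)]
    upper = means[int((1 - alpha) * n_resamples) - 1]
    return lower, upper

# Hypothetical grades coded 1 to 5; not data from the thread.
grades = [5, 4, 4, 5, 3, 4, 5, 5, 2, 4, 5, 3, 4, 4, 5]
low, high = bootstrap_mean_ci(grades)
print(f"mean = {mean(grades):.2f}, 95% bootstrap CI = ({low:.2f}, {high:.2f})")
```

Note that the interval still takes the numeric codes at face value; it quantifies sampling uncertainty in the mean, not uncertainty about whether the coding is justified.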

score 4 · Answer 2 · edited Aug 16 '13 at 09:05

Suppose we take ordinal values, e.g. 1 for strongly disagree, 2 for disagree, 3 for agree, and 4 for strongly agree. If four people give the responses 1,2,3 and 4, then what would be the mean? It is (1+2+3+4)/4=2.50.

How should that be interpreted, when the average response of the four people falls between "disagree" and "agree"? That is why we should not use the mean for ordinal data.

answered Aug 16 '13 at 08:59 by SAAN · edited Aug 16 '13 at 09:05 by Nick Cox

Playing devil's advocate a little, in this example, I would interpret 2.5 as being half-way between 2, "disagree", and 3, "agree". This makes sense as an average given that we have "strongly disagree" vs "strongly agree", and "disagree" vs "agree". – TooTone Aug 16 '13 at 09:22
Agree mean of 2.5 in this context still makes sense to me - halfway between disagree and agree, or in other words, neutral. – luciano Aug 16 '13 at 09:29
I think Azeem needs a stronger example. You could object to 2.5 as the average of 1, 2, 3, 4 children per family on the same grounds, how is that to be interpreted as it is not one of the defined values. That raises different issues. – Nick Cox Aug 16 '13 at 09:30
@Azeem I like your approach of providing a simple counter-example. How about an example where your categories are quantifying something, e.g. weights of children, number of marbles in a bag, e.g. something like small = "0-1", medium="2-4", large = "5-10", xl = ">10"? – TooTone Aug 16 '13 at 09:45
@luciano How does a mean of 2.5 make sense? You say it is halfway between disagree and agree (distance 0.5) and equally halfway between strongly agree and strongly disagree (distance 1.5). Have you ever seen this? And how can one mean have two interpretations? Actually 2.5 is not a representative value of your data; your data jump from integer to integer over the range 1 to 4. – SAAN Aug 16 '13 at 10:18
@TooTone In fact qualitative data are based on nominal and ordinal classification, and the example you provide belongs to interval or Liket scale; you are confusing yourself between ordinal and Liket scale data. – SAAN Aug 16 '13 at 10:21
Azeem: I disagree. @Tootone has a good example; his example and yours are both ordinal. What Likert [NB] scales are is best left as a different issue. – Nick Cox Aug 16 '13 at 10:38
@NickCox "Weight of children" is a quantitative measure and the mean is appropriate for that; I don't understand how "number of marbles in a bag" relates to the discussion; and "something like small="0-1"..." assumes infinitely many values within "0-1", "2-4", etc. – SAAN Aug 16 '13 at 10:46
I didn't say anything about weight. Please look again: my example is children per family, who you count. Your disagreement is with @TooTone, who makes a perfectly valid point that you could split a numeric scale into ordered (ordinal) categories. – Nick Cox Aug 16 '13 at 10:54
@NickCox Your answer is nice; I can't say anything about your answer. Yes, my disagreement is with TooTone, because as far as I know the point (splitting a numeric scale) relates to mixed (ordinal and interval) data, not purely ordinal data. – SAAN Aug 16 '13 at 11:00
I don't think this is difficult. You could ask someone to report age category for people in a family, in years, 0-4, 5-9, etc. Sure, underlying that is age in integer years, and underlying that in turn is a continuous scale. But what the researcher sees and can analyse are in this example just ordinal categories. That's what you name, what the researcher has, not what it might be. – Nick Cox Aug 16 '13 at 11:05
@NickCox Can we discuss it in chat room? – SAAN Aug 16 '13 at 11:09
Sorry, no, I personally don't use the chat room. I think it's best that (a) the whole of a discussion is accessible in one place for every one interested to see easily later (b) people edit their answers to spell out extra points they want to make or to explain their views more clearly. – Nick Cox Aug 16 '13 at 11:14
@NickCox OK. In your last example, "people in a family" has meaningful differences, so this characteristic belongs to the interval scale; also see Glen_b's comment on the question. – SAAN Aug 16 '13 at 11:18
I agree; but my point was that the mean not being a possible value is not the key objection, as that is easily possible with variables you don't call ordinal. (Also, the mean could be a possible value with ordinal data, as with 1,2,2,3.) – Nick Cox Aug 16 '13 at 11:21
But with qualitative data (nominal and ordinal) we use percentages or ratios, and with quantitative data (interval and ratio) we use averages. The mean is a defined value only when the sum is divisible by 4; in fact that is rare among all possible combinations, and we can't say it is defined this time and not that time. (We are not violating the basics.) – SAAN Aug 16 '13 at 11:31
I think you can strengthen your answer and I encourage you to do that. "because the mean might be an undefined value" is not a strong argument here, logically or psychologically, and does not focus on the deeper issue of whether equal differences really mean equal differences. – Nick Cox Aug 16 '13 at 11:39
I think we are going off on another track. @NickCox I need an explanation of why you call interval data split along the real line ordinal data. – SAAN Aug 16 '13 at 11:52
I can't add much to @NickCox's exposition. I was trying to think of an example where the categories are obviously skewed, so taking the mean of equally spaced category labels does not correspond to what the categories are labelling. "Splitting a numeric scale" as Nick said seemed a clear way of doing it. But you could also come up with something more subjective. E.g. weights of animals: 1="microscopic", 2="tiny", etc. – TooTone Aug 16 '13 at 12:43
@TooTone With 1="microscopic", 2="tiny", then interpret 1.5. – SAAN Aug 16 '13 at 12:48
I don't know how I can make it any clearer, but (e.g.) "0-4", "5-19", "20-114" are ordered (ordinal) in that there is only _one_ natural order to those measurements (short of reversal). If you want to call them other things too, that's fine by me. – Nick Cox Aug 16 '13 at 15:17

score 2 · Answer 3 · edited Aug 16 '13 at 09:32

I totally agree with @Azeem. But just to drive this point home let me elaborate a bit further.

Let's say you have ordinal data like in the example from @Azeem, where your scale ranges from 1 through 4. And let's also say you have a couple of people rating something (like Ice Cream) on this scale. Imagine that you get the following results:

Person A said 4
Person B said 3
Person C said 1
Person D said 2

When you want to interpret the results, you can conclude something to the extent of:

Person A liked Ice Cream more than Person B
Person D liked Ice Cream more than Person C

However, you don't know anything about the intervals between the ratings. Is the difference between 1 and 2 the same as that between 3 and 4? Does a rating of 4 really mean that the person likes Ice Cream 4 times more than someone who rates it as 1? And so on... When you compute the arithmetic mean, you treat the numbers as if the differences between them were equal. But that's a pretty strong assumption with ordinal data and you would have to justify it.
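A small sketch may help separate what the ratings do and do not tell you. It applies an alternative order-preserving coding to the four ratings listed above; the recoding values are an assumption chosen purely for illustration. The order comparisons come out the same under both codings, but whether the gap between 1 and 2 equals the gap between 3 and 4 depends entirely on the coding.

```python
# Ratings from the example above, as ranks on the 1-to-4 scale.
ratings = {"A": 4, "B": 3, "C": 1, "D": 2}

# An alternative order-preserving coding (an assumption for illustration).
recode = {1: 1, 2: 23, 3: 456, 4: 7890}
recoded = {person: recode[r] for person, r in ratings.items()}

# Order comparisons give the same answer under both codings.
print(ratings["A"] > ratings["B"], recoded["A"] > recoded["B"])
print(ratings["D"] > ratings["C"], recoded["D"] > recoded["C"])

# Differences between adjacent categories do not.
gap_low_original = ratings["D"] - ratings["C"]    # categories 2 and 1
gap_high_original = ratings["A"] - ratings["B"]   # categories 4 and 3
gap_low_recoded = recoded["D"] - recoded["C"]
gap_high_recoded = recoded["A"] - recoded["B"]
print(gap_low_original == gap_high_original, gap_low_recoded == gap_high_recoded)
```

Since an arithmetic mean is built from sums, and sums rely on those gaps, it inherits whatever the coding happens to assume about them.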

I edited out the reference to the answer above. Answers can change order and in fact the answer that was above is at this moment below, and that can change. So cross-refer to posters, not position. — Nick Cox, Aug 16 '13 at 09:28

score 0 · Answer 4 · edited Jan 01 '16 at 10:34

I agree that the arithmetic mean cannot be truly justified for ordinal scale data. Instead of calculating the mean, we can use the mode or median in such situations, which can give a more meaningful interpretation of our results.

answered Jan 01 '16 at 07:28 by ayaz · edited Jan 01 '16 at 10:34 by Nick Cox

This doesn't address the question of **why** it might be inappropriate. – Nick Cox Jan 01 '16 at 10:34
