For example, assume people were asked to rate a movie on a scale from 1 to 10, and their rating had to be a whole number in this range (so no decimals).

Would this rating be classified as a numeric or categorical variable?

Also, to take this a step further, say we asked them to rate two movies on the same scale and then summed the two ratings to produce a "result" variable. Would this be categorical or numeric?

I would be inclined to say numeric for both. Technically, though, there is only a fixed set of ratings they could give, which means the "result" variable would have to be a whole number between 2 and 20. Could both variables therefore be classified as categorical?

How would you classify the variables in these scenarios?

Nick Cox
Kermitty
  • I imagine the answer to this question heavily depends on what you want to do with these ratings – nope Oct 09 '19 at 08:00
  • It's categorical, specifically ordinal. (If for you categorical doesn't include ordinal, rewrite that as you prefer.) Whether it's numeric in the sense of a measured variable pivots on what you mean by measurement. As for whether you can validly work with sums, there is an entire spectrum of opinion from that being out of order to it being what people do, and the heck with it. As the thread nominated as duplicate hints, many universities make heavy use of mean grades (naturally, just a step away from summing them), even while some academics within explain that you should not do this. – Nick Cox Oct 09 '19 at 08:20
  • Note that we don't object to a mean family size of 2.3 children or whatever on the grounds that 2.3 can't be a value. A mean carries the implication that it may not match any possible value. – Nick Cox Oct 09 '19 at 08:21

1 Answer

Opinion variables on a 1 to 5 or 1 to 10 scale are usually treated as ordinal categorical variables. On a 1-to-10 scale, suppose someone rates one movie as 5 and another as 10. It is difficult to imagine what it would mean to say that they liked the second movie 'twice as much' as the first.

However, when you add two scores, you are automatically attributing numerical characteristics to the scores. If one person gives scores 5 and 6 and another person gives the same two movies scores 1 and 10, it is difficult to imagine the two people had exactly the same overall satisfaction watching the two movies. The experience watching the movie rated 1 must have been sufficiently dreadful (or tedious or offensive) to remember for a while.
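The example above can be sketched in a few lines of Python. This is only an illustration: the labels `rater_a` and `rater_b` are hypothetical names for the two people described, and the scores are the ones from the paragraph.

```python
# Scores from the example above; the rater labels are illustrative only.
ratings = {
    "rater_a": [5, 6],
    "rater_b": [1, 10],
}

# Summing treats the ratings as numbers, so both raters get the same total.
totals = {name: sum(scores) for name, scores in ratings.items()}

# An order-based summary (the lowest rating given) still tells them apart.
lowest = {name: min(scores) for name, scores in ratings.items()}

print(totals)
print(lowest)
```

Both totals come out equal, while the lowest rating differs between the two raters, which is the information the sum discards.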

The same difficulty arises with grade point averages. Grades in some courses may be almost numerical, based on an average of test scores. In other courses (perhaps an art course in making pottery) grades may depend on an instructor's subjective opinion of the pots you made during the term. It would make more sense to have 'grade point medians' than grade point averages. The reason we have GPAs may be historical. More than 70 years ago, it would not have been possible for schools to compute grade point medians for hundreds or thousands of students. Computationally, means are easier than medians.

For simple data such as 3, 1, 2, 7, 5 it may be easier to say the median is 3 than to say that the mean is 3.6. However, computation of medians requires sorting, and sorting large numbers of scores takes more computing power than adding them. Consider just the following 20 scores:

3  5  3  6  3  4  6  3  5  7  5  2  1  6  5  6 10  3  5  9

A $10 calculator can find the running total and get the mean 4.85. But finding the median to be 5 requires sorting, which such a simple calculator probably couldn't manage.
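As a rough check of the two figures, here is a minimal Python sketch using the 20 scores listed above. The variable names are illustrative; the mean is computed with a running total, as a simple calculator would, and the median by sorting first.

```python
scores = [3, 5, 3, 6, 3, 4, 6, 3, 5, 7, 5, 2, 1, 6, 5, 6, 10, 3, 5, 9]

# Mean: a single pass with a running total; no sorting is needed.
running_total = 0
for s in scores:
    running_total += s
mean_score = running_total / len(scores)

# Median: the data must be sorted before the middle can be located.
ordered = sorted(scores)
n = len(ordered)
if n % 2 == 0:
    # Even count: take the midpoint of the two middle values.
    median_score = (ordered[n // 2 - 1] + ordered[n // 2]) / 2
else:
    median_score = ordered[n // 2]

print(running_total, mean_score, median_score)
```

Here the two middle values of the sorted list are equal, so the median is itself a rating that could actually occur, unlike the mean.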

Historically, statistics has been a bit slow to adopt the use of truly appropriate procedures for ordinal data.

BruceET
  • In my day job, and over some decades now, I have spent a lot of time dealing with (the local equivalent) of grade point averages. I've never encountered any opinion that medians would do a better job than means, or your history that people would have preferred them but they were harder work. (Tukey was still in the 1970s showing how medians are _suited to_ hand calculation.) Means are all along presented to everyone in the system as what will emerge as the summary. That's not to rule out, although it's not the procedure locally, a system of ignoring the worst marks, which is sometimes used. – Nick Cox Oct 09 '19 at 09:37
  • Didn't mean to imply people would have preferred GPM's. Since infeasible, most people probably never even considered them. // One advantage might be that straight-A students might be willing to experiment broadening horizons if a C grade in an off-major course would have no impact on reported summary. (Ignoring worst is a worthy idea, never seriously discussed for GPA's anywhere I've taught.) – BruceET Oct 09 '19 at 10:03
  • Ethics and even the goals of education feature in any discussions. I would guess at a widespread reluctance at medians because it would be feared that even good students might game the system, registering for courses and then putting no effort into them. Not to distract from the OP's question which isn't about academic grades but I maintain that everyone knowing that grades will be averaged is part of the discussion here. I do grade hotels sometimes, not movies, and I am usually anxious to do my tiny bit to shift the mean one way or the other. – Nick Cox Oct 09 '19 at 10:09