
I have several variables in a dataset, and I'm not sure whether they should be classified as nominal or ordinal.

Context: the variables are part of a dataset in which each case is a student. Of those students, 100 were part of an experimental group and 51 were part of the control group. I want to test the hypothesis that the experimental group performed better than the control group. To choose suitable statistical tests for that hypothesis, I first need to establish the correct type of each variable: nominal or ordinal.

The first variable can take three possible values:

0 (indicating 'wrong answer given on question X')
1 (indicating 'partly correct answer given on question X')
2 (indicating 'correct answer given on question X')

My doubts here: I believe we can only define a variable as ordinal when a certain order can be established. I think we can: 0<1<2. Is this correct reasoning?

Second variable holds two possible values:

0 (indicating 'wrong answer given on question Z')
1 (indicating 'correct answer given on question Z')

My doubts on this one: if I recall correctly, variables with only the values 0 and 1 are usually treated as nominal. But couldn't 'wrong answer' and 'correct answer' also be ordinal values, since 0<1?

So my question here is: for each of those variables, should I treat them as nominal or ordinal?
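
To illustrate what treating the first variable as ordinal amounts to, here is a minimal Python sketch. The scores are made up for illustration and are not taken from the actual dataset, and `mann_whitney_u` is a hypothetical helper written for this example, not a library function. It computes the Mann-Whitney U statistic, which uses only the ordering 0<1<2, and checks that an order-preserving recoding of the values leaves the statistic unchanged.

```python
from itertools import product


def mann_whitney_u(xs, ys):
    """U statistic for xs versus ys: a pair with x > y counts 1, a tie counts 0.5."""
    return sum(
        1.0 if x > y else 0.5 if x == y else 0.0
        for x, y in product(xs, ys)
    )


# Hypothetical scores on question X (0 = wrong, 1 = partly correct, 2 = correct).
experimental = [2, 2, 1, 2, 0, 1, 2, 1]
control = [0, 1, 0, 2, 1, 0]

u = mann_whitney_u(experimental, control)

# An order-preserving recoding (the spacing changes, the order does not).
recode = {0: 0, 1: 10, 2: 11}
u_recoded = mann_whitney_u(
    [recode[x] for x in experimental],
    [recode[y] for y in control],
)

# Share of (experimental, control) pairs in which the experimental student
# scores higher, with ties counted as one half.
p_superiority = u / (len(experimental) * len(control))

print(u, u_recoded, p_superiority)
```

If the variable were only nominal, the labels 0, 1 and 2 could be permuted freely and a rank-based statistic like this would have no meaning; the fact that only monotone recodings leave it unchanged is what "ordinal" buys here.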

  • 1
    In general, how you treat a variable depends on the analysis you are performing. If you could provide more of that contextual information you would likely get answers that are more useful. BTW, what distinction are you making between "on" and "in" in the second variable? Both values appear to assert a wrong answer was given. – whuber Jun 23 '15 at 18:12
  • @whuber hi, thanks for your quick response! I've added some context to the original question! Any thoughts on that matter? – Dennis Hunink Jun 23 '15 at 18:16
  • 1
    (1) Your second variable now has only *one* possible value! (2) A more fundamental issue with this analysis concerns how you intend to quantify a "better" performance. Resolving that issue would likely answer any further questions about how to encode the variables for analysis. – whuber Jun 23 '15 at 18:19
  • @whuber (1) you're right, 0 is not a real value. (2) by better performance I mean the experimental group would score higher than the control group. And by higher I mean giving more correct answers (second variable) or giving more partly correct answers (first variable) – Dennis Hunink Jun 23 '15 at 18:22
  • @DennisHunink: read up a bit on [measurement levels](http://www.spss-tutorials.com/measurement-levels/). It's not about 0 > 1 > 2. What matters is whether "what 0 represents" > "what 1 represents". – RubenGeert Jun 23 '15 at 18:49
  • @RubenGeert thanks for the link. It gave me some meaningful insights. In essence my question comes down to: "are 'wrong' and 'right' values that have an indisputable order? If so, they're ordinal". Right? – Dennis Hunink Jun 23 '15 at 18:55
  • For your `X`, it is ordinal (you might treat it as interval, too, but it'll be a stretch). Binary variable `Z` is _binary_ (or dichotomous). Depending on the research and analysis context it can be taken for nominal or for ordinal ([see, for example](http://stats.stackexchange.com/a/116859/3277)). Or all the same. As an independent variable in regression models it produces the same result whether it is considered a categorical factor or a "continuous" covariate. – ttnphns Jun 23 '15 at 19:51
  • @DennisHunink: "nominal" or "ordinal" are meaningless with regard to dichotomous variables. Since there are only two valid values, there is only one interval between them, hence they are metric by definition. This is also the reason why nominal variables can be used in regression after [dummy coding](http://www.spss-tutorials.com/creating-dummy-variables-in-spss/) them: the resulting dichotomous variables are metric and can thus be safely entered into the regression model. – RubenGeert Jun 23 '15 at 19:55
  • 1
    One more note. SPSS suggests to assign the type - nominal, ordinal or scale. But few SPSS procedures make use of this info, the majority of procedures don't need it. – ttnphns Jun 23 '15 at 19:55
  • @ttnphns thanks for your comment! After reading the link provided by RubenGeert, your thoughts truly answer my question. I believe I've learned more in the past hours reading all the comments than I did during all the hours of attending classes at the university :p. But that's a side note. Please post your comment as a real answer, so I can mark this as solved! – Dennis Hunink Jun 23 '15 at 19:56
  • @RubenGeert I get that now, thanks so much! But if I understand correctly: treating dichotomous variables as ordinal, since they mean 'answered correctly' or 'answered wrong', is suitable for the described situation. In addition, if I would like to perform regression analysis I could recode the variables that hold three categories into dummies. – Dennis Hunink Jun 23 '15 at 19:59
  • While most of the built-in procedures in SPSS Statistics do not use the measurement level, leaving it to the user to classify variables as factors or covariates, most of the many R-based extension commands do automatically use the measurement level to create factors appropriately. – JKP Jun 24 '15 at 12:31
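
Two points from the comments above can be sketched in a few lines of Python: a 0/1 predictor gives the same regression fit whether it is read as a two-level factor or as a numeric covariate, and a three-category variable can be dummy coded into 0/1 indicators. The data are invented for illustration, `ols_simple` and `dummy_code` are hypothetical helpers rather than library functions, and treating the 0/1/2 score as a numeric outcome is itself an assumption (the "stretch" mentioned above).

```python
from statistics import mean


def ols_simple(x, y):
    """Least-squares intercept and slope for the model y = a + b * x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def dummy_code(values, reference):
    """Return one 0/1 indicator column per level other than the reference level."""
    levels = sorted(set(values) - {reference})
    return {lvl: [1 if v == lvl else 0 for v in values] for lvl in levels}


# Hypothetical data: group (0 = control, 1 = experimental) and a 0/1/2 score.
group = [0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
score = [0, 1, 0, 1, 2, 1, 2, 2, 0, 2]

# Reading the binary predictor as a numeric covariate.
intercept, slope = ols_simple(group, score)

# Reading it as a two-level factor: the fit reduces to the two group means.
mean_control = mean(s for g, s in zip(group, score) if g == 0)
mean_experimental = mean(s for g, s in zip(group, score) if g == 1)

print(intercept, slope)
print(mean_control, mean_experimental - mean_control)

# Dummy coding a three-category variable with 0 ('wrong') as the reference.
print(dummy_code([0, 2, 1, 2], reference=0))
```

Up to floating-point rounding, the intercept matches the control-group mean and the slope matches the difference between the group means, which is why the factor and covariate readings coincide for a dichotomous predictor.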
