This introduction to the Rasch model says "the Rasch model emphasizes the primacy of the requirements for fundamental measurement". What exactly does "requirements for fundamental measurement" mean?

According to this site, fundamental measurement is merely direct measurement, such as of weight and height. What kind of requirements were being referred to, then?

What exactly is the difference between "the fundamental measurement of extensive quantities" (as also mentioned in this site) and "the fundamental measurement of non-extensive quantities"?

Aqqqq

1 Answer

This is convoluted language, and I don't agree with the claims put forward.

In broad strokes, the Rasch model is simpler than 3PL (the three-parameter logistic model). The premise is similar: you have a bunch of questions which you believe assess a trait, you have a sample of people answering the questions, and the people and the questions each vary in their likelihood of producing a "positive" response. Rasch enters the "difficulty" (of a question) and the "ability" (of a participant) as offsets to the logistic curve.
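To make the "offsets to the logistic curve" concrete, here is a minimal sketch of the two response functions. The function names are made up for illustration, and the 3PL form uses the usual textbook parameterization (a discrimination slope and a guessing floor), which is an assumption on my part rather than something taken from the linked introduction.

```python
import math


def rasch_prob(ability, difficulty):
    """P(positive response) under Rasch: logistic of (ability - difficulty)."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def three_pl_prob(ability, difficulty, discrimination=1.0, guessing=0.0):
    """P(positive response) under 3PL (assumed standard parameterization).

    discrimination scales the slope of the curve for this question;
    guessing is a lower asymptote for the probability.
    """
    logistic = 1.0 / (1.0 + math.exp(-discrimination * (ability - difficulty)))
    return guessing + (1.0 - guessing) * logistic


if __name__ == "__main__":
    # A person whose ability equals the question's difficulty.
    print(rasch_prob(0.0, 0.0))
    # Same person and question, but with a nonzero guessing floor.
    print(three_pl_prob(0.0, 0.0, discrimination=1.0, guessing=0.2))
```

With `discrimination=1.0` and `guessing=0.0` the 3PL curve collapses to the Rasch curve, which is the sense in which Rasch is the more constrained of the two.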

Obviously this is a much more constrained setting. The Rasch model assumes that the odds ratio of a correct response between any two individuals is the same on every question and, likewise, that the odds ratio between any two questions is the same for every individual; ability and difficulty are the only free parameters. If you are lucky enough to design an instrument which meets these constraints, the sum score is immediately available as a validated scale. You don't get that with 3PL.
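The odds ratio constraint can be checked numerically. The following is only a sketch with made-up ability and difficulty values and hypothetical helper names: under Rasch the odds of a correct response are exp(ability - difficulty), so the ratio of odds between two people does not depend on which question you pick.

```python
import math


def rasch_odds(ability, difficulty):
    """Odds of a correct response under Rasch: exp(ability - difficulty)."""
    return math.exp(ability - difficulty)


def odds_ratio_between_people(ability_a, ability_b, difficulty):
    """Odds ratio of person A versus person B on one question."""
    return rasch_odds(ability_a, difficulty) / rasch_odds(ability_b, difficulty)


# Illustrative values only: three questions of very different difficulty.
difficulties = [-1.5, 0.0, 2.0]

# Compare a person with ability 1.0 against a person with ability 0.0.
ratios = [odds_ratio_between_people(1.0, 0.0, d) for d in difficulties]

if __name__ == "__main__":
    for d, r in zip(difficulties, ratios):
        print(f"difficulty={d:+.1f}  odds ratio={r:.6f}")
```

Every ratio comes out as exp(1.0 - 0.0) up to floating-point error, because the difficulty term cancels. Once a question gets its own slope or guessing floor, as in 3PL, that cancellation no longer happens.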

Take as an example a math aptitude test given to children of a variety of ages. The test includes questions on arithmetic, geometry, and algebra, but some of the children have been trained in algebra and some have not. The algebra questions need 3PL to provide a better scale, because merely saying the questions are "difficult" is not enough: your most apt student with training in only arithmetic and geometry will still be penalized too much for their ability, while students with complete training will be overly credited.

I don't agree with the claims because identifying and adjusting for these issues in 3PL is logistically very difficult and unlikely to provide a scale with any external validity: you are using the same data set both to generate hypotheses (about differential item functioning, essentially) and to confirm hypotheses (about construction of a validated scale). With extensive fudging you can construct a weighted scale from 3PL that, in that particular sample, will correctly rank subjects in terms of their ability, but its external validity will be highly variable.

AdamO