Situation: A three-part ability test with multiple-choice, essay, and oral components. I have scores for each of 110 test takers on each part and overall. The three parts are unequally weighted in computing the overall score. There is reason to think that the three parts measure different abilities and are not highly correlated.

Available information: internal-consistency reliability of the M/C test (.82), and interrater reliability of the grading of the essay and oral components (both near .95). The essay and oral parts were graded by different graders.

Desired goal: Estimate the test-retest reliability of the overall score.

Would the low reliability of .82 drive the reliability of the total score, or would the reliability of the total score be closer to a weighted average of the three reliabilities? Is there a formula for combining these reliabilities to get an estimate of the test-retest reliability of the overall score? What else would I need to know, or what assumptions would I need to make, to use the formula? If this is a complex matter, can you refer me to a relevant textbook or journal article?

I found these two related questions on this site, but the answers were not helpful to me: "Reliability of Composite Variable Made of 4 Measures?" and "How to assess the reliability of a composite scores?"

I guess I could do a Monte Carlo study. My gut feeling is that the low reliability will drive the overall reliability. But is there an analytic approach that would give a more definitive answer?

Joel W.

1 Answer

Please find below a link to the chapter on reliability that discusses many of your issues. For the specific issue of test-retest reliability, see pp. 27-28; the reliability of a composite test is covered on p. 32. The chapter is written by William Revelle, a reputable psychometrician, so the source is credible.

https://www.personality-project.org/r/book/Chapter7.pdf
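For orientation, composite reliability for a sum of subtests is commonly written in the form below. This is a sketch: the symbols ($r_{ii}$ for the reliability of subtest $i$, $\sigma_i^2$ for its variance, $\sigma_X^2$ for the variance of the total score, $k$ for the number of subtests) are assumed here for illustration, and the chapter's own notation may differ.

```latex
r_{XX} = 1 - \frac{\sum_{i=1}^{k} \sigma_i^2 \,(1 - r_{ii})}{\sigma_X^2}
```

In the special case of standardized, uncorrelated subtests, $\sigma_i^2 = 1$ and $\sigma_X^2 = k$, so the expression reduces to $r_{XX} = \frac{1}{k}\sum_{i=1}^{k} r_{ii}$, the simple mean of the subtest reliabilities.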

PsychometStats
  • Are you referring to formula 7.38? (Nice chapter!) – Joel W. Oct 22 '19 at 23:06
  • How do you understand the formula to work if the subtests are independent and standardized before combination? (Clearly the reliability of the sum is less than the reliability of the least reliable subtest.) What is the effect of the subtest and whole test variances in the situation I presented? – Joel W. Oct 22 '19 at 23:34
  • @JoelW. Yes, that's right: equation 7.38 is composite reliability. To calculate it, you need the reliability of each subtest (e.g. alpha), the variance of each subtest, and the total test variance. Plug those values into the formula and you have the composite reliability of your test. If you found this answer useful, I would be appreciative if you could accept it, please. – PsychometStats Oct 22 '19 at 23:36
  • @JoelW. ok, there are two ways in which you can proceed, I will write them down for you in a moment – PsychometStats Oct 22 '19 at 23:41
  • @JoelW. 1. I assume that M/C test is one of three subtests. Then, you can use reliability of each component to get total reliability. That is, internal consistency of the M/C (.82), inter-rater reliability for grading (about .95), and for oral (about .95). You also have the remaining parts, including variance of each subtest and total variance. This method is based in my opinion on a less stringent assumption of combining different types of reliability into composite reliability. I would not worry about standardisation, because it is common practice in reliability assessment. – PsychometStats Oct 22 '19 at 23:50
  • @JoelW. Neither would I worry too much that the subtests are independent, because after all you are calculating COMPOSITE reliability. However, I can see how independence may worry you. In which case, a common practice would be to proceed but acknowledge the fact of independence. This will help you to be pre-emptive, before potential criticism arises. – PsychometStats Oct 22 '19 at 23:52
  • @JoelW. 2. In my view, a bigger problem with composite reliability is including two different types of reliability (interrater and internal consistency). You can still legitimately do it, as mentioned in option 1. However, if you wish to be extra stringent, I would base the composite reliability on two similar indices, i.e., inter-rater reliability for the oral test and for essay grading. This could give you two extra benefits. First, since each of those has high reliability (around .95), you would expect composite reliability to be higher with only these two indices than – PsychometStats Oct 22 '19 at 23:57
  • @JoelW. by incorporating a different type of reliability into the process (internal consistency for M/C), especially as its alpha is .82. So I would expect lower composite reliability with three elements. But please check by all means. Another advantage would be that your composite reliability is more methodologically stringent, as each subtest was measured using the same, and thus comparable, methodology (inter-rater agreement). In general, this is my take on this. To be honest, test development has some room for interpretation, as long as you can justify either of your choices. – PsychometStats Oct 23 '19 at 00:00
  • From inspection of formula 7.38 and trying a few sets of 3 subtest reliability values, it seems that formula 7.38 is mathematically equivalent to the mean of the 3 reliabilities. Does that seem correct to you? – Joel W. Oct 23 '19 at 00:22
  • @JoelW. If you like, write the values for each component here and I can double-check it for you, in case you want another pair of eyes to look at this. Generally, if you look at the example for 7.38, where they have three subtest reliabilities (.857, .857, and .823), the composite reliability is .90. Also, looking at the formula, it does not look like the mean of three reliabilities to me. Given the reliabilities of your three subtests, I would guess the composite reliability is probably >.80. However, I don't know the subtests' variances or the total variance, for that matter, to be sure – PsychometStats Oct 23 '19 at 00:35
  • The example in chapter 7 did not involve independent subtests. If each σ_i² = 1 and the I subtests are independent, then σ_X² will be equal to I. Given that, the reliability predicted by the formula and the mean reliability are the same. Do you agree? – Joel W. Oct 23 '19 at 01:39
  • @JoelW. yes that makes sense – PsychometStats Oct 23 '19 at 02:09
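
The point reached in the comments above can be checked numerically. The following is a minimal sketch, assuming the composite-reliability formula has the common form 1 − Σ wᵢ²σᵢ²(1 − rᵢᵢ) / σ_X²; the function name, the weights, and the unit variances are hypothetical values chosen for illustration, not figures from the question. It treats the three coefficients as interchangeable reliabilities, which is itself an assumption, since internal consistency and interrater agreement are not test-retest coefficients.

```python
def composite_reliability(reliabilities, variances, total_variance, weights=None):
    """Reliability of a (weighted) sum of subtests.

    total_variance must be the variance of the weighted composite itself.
    """
    if weights is None:
        weights = [1.0] * len(reliabilities)
    # Error variance contributed by each subtest: w_i^2 * sigma_i^2 * (1 - r_ii)
    error_variance = sum(
        w ** 2 * v * (1.0 - r)
        for w, v, r in zip(weights, variances, reliabilities)
    )
    return 1.0 - error_variance / total_variance


# Reliabilities from the question: M/C, essay, oral
rels = [0.82, 0.95, 0.95]
variances = [1.0, 1.0, 1.0]  # assumption: standardized subtests

# Case 1: unit weights, independent subtests, so total variance = sum of variances
equal_case = composite_reliability(rels, variances, sum(variances))
mean_rel = sum(rels) / len(rels)

# Case 2: hypothetical unequal weights, still assuming independent subtests,
# so the composite variance is the sum of w_i^2 * sigma_i^2
weights = [0.5, 0.3, 0.2]
weighted_total = sum(w ** 2 * v for w, v in zip(weights, variances))
weighted_case = composite_reliability(rels, variances, weighted_total, weights)

print(round(equal_case, 4), round(mean_rel, 4), round(weighted_case, 4))
```

Under these assumptions the equal-weight result coincides with the simple mean of the three reliabilities, while weighting the M/C part more heavily pulls the composite toward .82. If the subtests are positively correlated, the total variance grows while the error variance does not, so the composite reliability would be higher than in the independent case.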