
Furr and Bacharach (2014) present on p. 115 the following equation for the covariance between observed and true scores:

$$cov_{ot} = \frac{\sum(X_t+X_e+\bar{X}_t)(X_t+\bar{X}_t)}{N} $$

I understand how one gets from the basic principles of Classical Test Theory to that representation of the covariance between observed and true scores. However, Furr and Bacharach (2014) go on to say:

Algebraically simplifying this equation, we find that the covariance between observed scores and true scores is equal to the sum of (a) the variance in true scores and (b) the covariance between true scores and error scores:

$$cov_{ot} = s^{2}_{t} + c_{et} $$

Could someone present a step-by-step explanation of how the algebraic simplification occurs? I’m looking for something that could hypothetically be explained to a person with only a basic high school knowledge of algebra.

Furr, R. M., & Bacharach, V. R. (2014). Psychometrics: An introduction. Los Angeles, CA: Sage.

1 Answer

There is a small mistake in the formula you wrote here. It should be:

$$cov_{ot} = \frac{\sum(X_t+X_e-\bar{X}_t)(X_t-\bar{X}_t)}{N} $$

1) The long way

First, multiply out the two parentheses and group the resulting terms. This gives:

$$cov_{ot} = \frac{\sum(X_t^2-2\bar{X}_tX_t+\bar{X}_t^2)}{N} + \frac{\sum(X_eX_t-X_e\bar{X}_t)}{N} \textrm{ (1) }$$
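If the distribution step feels too quick, one way to see it is to treat $(X_t-\bar{X}_t)$ as a single block. The label $d$ below is introduced here only for illustration and does not appear in the book:

$$ (X_t+X_e-\bar{X}_t)(X_t-\bar{X}_t) = (d + X_e)\,d = d^2 + X_e d, \textrm{ where } d = X_t-\bar{X}_t $$

$$ d^2 = X_t^2-2\bar{X}_tX_t+\bar{X}_t^2 \textrm{ and } X_e d = X_eX_t-X_e\bar{X}_t $$

Summing both pieces over all persons and dividing by $N$ gives the two fractions in $(1)$.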

Let's call the first part of the equation $a$ and the second part of it $b$. $$ cov_{ot} = a +b $$

$$ a = \frac{\sum(X_t^2-2\bar{X}_tX_t+\bar{X}_t^2)}{N} $$ $$b=\frac{\sum(X_eX_t-X_e\bar{X}_t)}{N}$$

First, look at the first group of terms, $a$. Distribute the $\sum$ over the terms inside the parentheses to get the following:

$$ a = \frac{\sum X_t^2}{N} - \frac{2\bar{X}_t\sum X_t}{N} + \frac{\sum \bar{X}_t^2}{N} $$

Note that $\bar{X}_t$ is a constant, so by the distributive property of summation it can be moved in front of the summation sign. For the same reason, summing the constant $\bar{X}_t^2$ over all $N$ scores gives $N\bar{X}_t^2$.

Second, note that the mean of a variable equals the sum of all its values divided by the number of values. Equivalently, the sum of all values equals the number of values multiplied by the mean. In other words:

$$ \bar{X}_t = \frac{\sum(X_t)}{N} \textrm{ and } \sum(X_t) = N\bar{X}_t \textrm{ (2) } $$

So, substituting $(2)$, we get the following:

$$ a = \frac{\sum X_t^2}{N} - \frac{2N\bar{X}_t\bar{X}_t}{N} + \frac{N\bar{X}_t ^2}{N} $$

$$ = \frac{\sum X_t^2}{N} - \frac{2N\bar{X}_t^2}{N} + \frac{N\bar{X}_t ^2}{N} $$

$$= \frac{\sum X_t^2}{N} - \bar{X}_t ^2 $$

Recall the shortcut formula for the variance:

$$var(X) = \frac{\sum X_i^2}{N} - \bar{X}^2 \textrm{ (3)}$$

So, from $(3)$, we will get the variance of the true score:

$$ a= s^{2}_{t} = \frac{\sum X_t^2}{N} - \bar{X}_t ^2 $$

The second part of the equation $(1)$ is more straightforward:

$$b=\frac{\sum(X_eX_t-X_e\bar{X}_t)}{N}$$

Distributing the $\sum$ and moving the constant $\bar{X}_t$ in front of the sum, and then using identity $(2)$ for the mean of $X_e$, we can write this as we did for the first part:

$$=\frac{\sum(X_eX_t)}{N} - \bar{X}_t \frac{\sum(X_e)}{N}$$

$$=\frac{\sum(X_eX_t)}{N} - \bar{X}_t \bar{X}_e $$

Recall the shortcut formula for the covariance:

$$cov(X,Y) = \frac{\sum(X_iY_i)}{N} - \bar{X}\bar{Y} \textrm{ (4)} $$

So, from $(4)$, we can get:

$$ b = c_{et} = \frac{\sum(X_eX_t)}{N} - \bar{X}_t \bar{X}_e $$

Taken together, we have shown that:

$$ cov_{ot} = a +b $$

$$ cov_{ot} = s^{2}_{t} + c_{et}$$
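A quick numerical check can make the result more concrete. The sketch below uses made-up true and error scores (they are not from Furr and Bacharach, and the variable names are my own). It computes the left-hand side directly from the corrected definition and the right-hand side from the shortcut formulas $(3)$ and $(4)$, then compares them up to rounding error.

```python
# Made-up illustrative scores; any numbers of equal length would do.
true_scores = [4.0, 6.0, 5.0, 9.0, 6.0]     # X_t
error_scores = [1.0, -1.0, 0.5, -0.5, 2.0]  # X_e
n = len(true_scores)

mean_t = sum(true_scores) / n
mean_e = sum(error_scores) / n

# Left-hand side: the corrected definition, sum((X_t + X_e - mean_t)(X_t - mean_t)) / N
cov_ot = sum(
    (t + e - mean_t) * (t - mean_t)
    for t, e in zip(true_scores, error_scores)
) / n

# Right-hand side, part a: shortcut variance formula (3)
var_t = sum(t * t for t in true_scores) / n - mean_t ** 2

# Right-hand side, part b: shortcut covariance formula (4)
cov_et = sum(e * t for e, t in zip(error_scores, true_scores)) / n - mean_t * mean_e

# The two sides should agree up to floating-point rounding.
identity_holds = abs(cov_ot - (var_t + cov_et)) < 1e-9
print(cov_ot, var_t + cov_et, identity_holds)
```

Changing the numbers in the two lists should leave `identity_holds` true, since the derivation above does not depend on the particular scores.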

2) The short way

We can also show this relationship by using the properties of the covariance of linear combinations.

Recall that $$ Cov(X,X) = Var(X)$$ $$ Cov(X+Z,Y) = Cov(X,Y)+Cov(Z,Y)$$
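The second property can itself be checked from the definition of covariance with only basic algebra. As a sketch, using the same divide-by-$N$ convention as above and the fact that the mean of $X+Z$ is $\bar{X}+\bar{Z}$:

$$ Cov(X+Z,Y) = \frac{\sum(X+Z-\bar{X}-\bar{Z})(Y-\bar{Y})}{N} = \frac{\sum(X-\bar{X})(Y-\bar{Y})}{N} + \frac{\sum(Z-\bar{Z})(Y-\bar{Y})}{N} $$

$$ = Cov(X,Y)+Cov(Z,Y) $$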

The observed score is $X_o = X_t+X_e$, so the covariance between the observed score and the true score is:

$$ Cov(X_t+X_e,X_t) = Cov(X_t,X_t)+ Cov(X_e,X_t) $$ $$ = Var(X_t)+ Cov(X_e,X_t)$$

That is to say, the covariance between observed scores and true scores is equal to the sum of (a) the variance in true scores and (b) the covariance between true scores and error scores.

mustafaakben