5

At the end of every semester I have to input my students' grades into a database. Typically this means I go one by one down my list of grades from my excel spreadsheet and manually fill the grade into the db application. One way to check I have done things correctly is to compare the mean and standard deviation of the grades from my excel sheet and the application.

I realised that this would not capture the error of swapping over two grades: i.e. my excel sheet has the grades as: 4,4,4.5,6,4 and I put in 4,4.5,4,6,4.

Is there an simple numerical measure that takes into account the order of the list such that a quick comparison of that gives me confidence that all is well?

I suppose one could do some kind of Gödel numbering, but I see that as rather overkill.

Glen_b
  • 257,508
  • 32
  • 553
  • 939
Geoff
  • 451
  • 2
  • 14
  • (i) It should be possible to automate such a process in some fashion, to reduce the risk (though such checks are always good). (ii) You'd see the serial correlation change if you swapped two different numbers - equivalently the product of the sequence and its first lag would change. – Glen_b Jun 26 '14 at 00:33

1 Answers1

5

The standard solution to this is a checksum. A checksum is a linear combination of the numbers. It is designed to change when certain kinds of errors are made. For example, the average of the grades is a checksum: it multiplies each of $n$ grades by $1/n$ and adds up the results. If any single grade is altered, the checksum is altered.

A convenient checksum to detect swaps in a list of numbers is to multiply the grades by $1,2,\ldots, n$ (in any order). Specifically, writing $(g_1, g_2, \ldots, g_n)$ for the list of grades, compute

$$g_1 + 2g_2 + \cdots + i g_i + \cdots + n g_n.$$

When $g_i$ and $g_j$ are swapped, the checksum changes by an amount

$$(i g_j + j g_i)- (i g_i + j g_j) = (i-j)(g_j - g_i).$$

It is an error to swap two different grades when $g_j-g_i\ne 0$; but since $i\ne j$, the change in the checksum must be nonzero, demonstrating that this checksum always detects a single swap. (It often will detect multiple swaps, but it is not guaranteed to do so.)

whuber
  • 281,159
  • 54
  • 637
  • 1,101