I want to add something to the previous answer, with which I completely agree.
It happens that I am working on implementing a statistical library in Java, and I use the values computed by R as a reference. A few days ago I was studying algorithms for computing the mean and variance, and I found that the C code which computes the mean in R (mean in R calls an internal function written in C) uses a simple technique to compensate for loss caused by rounding. And there I found exactly what you are looking for.
I will show simplified code, since the original C source uses macros and other unnecessarily complicated machinery:
double mean(double[] x) {
    double s = 0.0;
    int n = x.length;
    for (int i = 0; i < n; i++) s += x[i];
    s /= n;
    // second pass: accumulate deviations from the computed mean
    double t = 0.0;
    for (int i = 0; i < n; i++) t += x[i] - s;
    s += t / n;
    return s;
}
In the code above the variable t contains the sum of deviations about the mean. Interpreted strictly mathematically, that sum should be 0. But in computation the same statement should be restated as: t contains the sum of deviations about the mean as computed with finite precision.
The idea of the compensation is easiest to see with large values that have small variation. In that case s may lose precision (the low-order bits of each term are rounded away as the running sum grows), while computing t has a much better chance of avoiding that loss, since the values x[i] and the computed s are of comparable magnitude, so their differences are small and nearly exact.
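To make the effect visible, here is a small self-contained demo (a sketch; the names MeanDemo, naiveMean and correctedMean are mine, not R's) comparing the plain one-pass mean with the two-pass corrected version on large values with small variation. An exact reference mean is computed with BigDecimal, which represents each double exactly:

```java
import java.math.BigDecimal;
import java.math.MathContext;

public class MeanDemo {

    // Plain one-pass mean: sum everything, divide once.
    static double naiveMean(double[] x) {
        double s = 0.0;
        for (double v : x) s += v;
        return s / x.length;
    }

    // Two-pass mean with the correction described above.
    static double correctedMean(double[] x) {
        double s = 0.0;
        int n = x.length;
        for (int i = 0; i < n; i++) s += x[i];
        s /= n;
        // t accumulates deviations from the computed mean; since x[i]
        // and s are comparable in magnitude, these subtractions lose
        // little or no precision.
        double t = 0.0;
        for (int i = 0; i < n; i++) t += x[i] - s;
        return s + t / n;
    }

    public static void main(String[] args) {
        // Large values with small variation: 1e9 plus a small fractional part.
        int n = 1_000_000;
        double[] x = new double[n];
        for (int i = 0; i < n; i++) x[i] = 1e9 + (i % 10) * 0.1;

        // Exact reference mean: every double is an exact rational,
        // so the BigDecimal sum is exact.
        BigDecimal sum = BigDecimal.ZERO;
        for (double v : x) sum = sum.add(new BigDecimal(v));
        double exact = sum.divide(BigDecimal.valueOf(n), new MathContext(50))
                          .doubleValue();

        System.out.println("naive error:     " + Math.abs(naiveMean(x) - exact));
        System.out.println("corrected error: " + Math.abs(correctedMean(x) - exact));
    }
}
```

On data like this, the one-pass sum is accumulated near 1e15, where a single ulp is larger than the fractional parts being added, while the second pass sums deviations of magnitude below 1, so the corrected result typically lands within a few ulps of the exact mean.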