How should I determine an average from a set of averages

Question

I have a set of lines. I measure the length of each line 5 times, then take the mean and the standard deviation of those 5 measurements for each line.

I want to know the 'average' line length. Certainly I can take the mean of the per-line mean lengths, but how can I best weight these means to reflect the idea that some measurements are better (i.e. have smaller standard deviations) than others? (Should I do this?)

Further, how do I then describe the confidence I have in my final value?

The lines are generated by an experiment, and their lengths are measured from images. Consequently, repeated measurements of the same line should be identical, but I repeat each measurement because the measurement process is not very accurate.

Tim · Accepted Answer · 2015-02-27T09:24:53.453


If you want to include information about error, you can use a weighted mean, with weights based on the inverse of the individual variances ($\sigma^2_i$):

$$ \overline{x} = \frac{\sum^N_{i=1} x_i \sigma_i^{-2}}{ \sum^N_{i=1} \sigma_i^{-2}} $$
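
As a minimal sketch of the formula above in plain Python: the function name `inverse_variance_mean` and the numbers are made up for illustration, and each standard deviation is assumed to be strictly positive.

```python
def inverse_variance_mean(means, sds):
    """Weighted mean of `means`, with weights 1 / sd**2 (hypothetical helper)."""
    if len(means) != len(sds):
        raise ValueError("means and sds must have the same length")
    # Smaller standard deviation -> larger weight.
    # Assumes every sd > 0; an sd of 0 raises ZeroDivisionError here.
    weights = [1.0 / sd ** 2 for sd in sds]
    return sum(w * m for w, m in zip(weights, means)) / sum(weights)


# Made-up example: three lines, each summarised by a mean and a standard deviation.
line_means = [10.0, 12.0, 11.0]
line_sds = [1.0, 2.0, 1.0]
print(inverse_variance_mean(line_means, line_sds))
```

With these illustrative numbers the weights are 1, 0.25 and 1, so the noisier middle line counts for less and the result is about 10.67, compared with an unweighted mean of 11.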

If your sample sizes differ, you could instead weight by the inverse of the standard errors (inverse, since you want results with less error to have greater weight), because standard errors also include information on sample size ($n_i$):

$$ \overline{x} = \frac{\sum^N_{i=1} x_i (\sigma_i/\sqrt{n_i})^{-1}}{ \sum^N_{i=1} (\sigma_i/\sqrt{n_i})^{-1}} $$
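
A minimal sketch of this second formula, again in plain Python. The name `inverse_se_mean` and the data are illustrative assumptions. A zero standard error makes the weight undefined, so the sketch raises an error in that case rather than guessing a weight.

```python
import math


def inverse_se_mean(means, sds, ns):
    """Weighted mean with weights 1 / (sd / sqrt(n)) (hypothetical helper)."""
    if not (len(means) == len(sds) == len(ns)):
        raise ValueError("means, sds and ns must have the same length")
    weights = []
    for sd, n in zip(sds, ns):
        se = sd / math.sqrt(n)  # standard error of that group's mean
        if se == 0:
            # Degenerate case: the weight 1 / se is undefined.
            raise ValueError("zero standard error: weight is undefined")
        weights.append(1.0 / se)
    return sum(w * m for w, m in zip(weights, means)) / sum(weights)


# Made-up example: same spread, but the second mean rests on more measurements.
line_means = [10.0, 12.0]
line_sds = [1.0, 1.0]
line_ns = [4, 16]
print(inverse_se_mean(line_means, line_sds, line_ns))
```

The sketch follows the formula as written, with weights $1/\mathrm{SE}_i$. Weighting by $1/\mathrm{SE}_i^2$ instead would be the direct inverse-variance analogue of the first formula.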

Generally, if the only thing you want to do is to weight your means then you probably do not need anything more sophisticated.


In the second approach, what happens if the standard error is zero (because the standard deviation is zero too)? – Alejandro Benito-Santos Nov 21 '19 at 15:51
@ale0xB it would be a degenerate case and it would fail. – Tim Nov 21 '19 at 17:01

Ben Kuhn · Answer 2 · 2015-02-27T07:29:13.393

What do you mean by "average"? Do you mean that your model for the $j$-th measured length of line $i$ is $L_{i,j} = L_{avg} + \epsilon^{line}_{i} + \epsilon^{measure}_{i,j}$ (basically, there's some variance in line length and some variance in measurement on top of that) and you're trying to estimate $L_{avg}$ given a number of lines and measurements?

If so, then this is a simple case of a multilevel model. For instance, you could fit it in R with `lmer(length ~ 1 + (1 | line))` from the lme4 package (`lmer` rather than `glmer`, since the response is modelled as Gaussian) and extract the global average from the global intercept term. (And the length of each line from that line's intercept term.)

In fact, it's such a simple case that there might be a simpler way to estimate it; perhaps someone more familiar with multilevel models can correct me.
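
One possible simpler route, sketched here under a stated assumption: if every line has the same number of measurements (a balanced design, as in the question's 5 measurements per line), the random-intercept model can be estimated with one-way ANOVA mean squares. In that case the estimate of $L_{avg}$ is the grand mean, and its standard error is the standard deviation of the per-line means divided by $\sqrt{k}$ for $k$ lines. The function name and data below are made up, and the sketch does not cover unbalanced data.

```python
import math
from statistics import mean


def balanced_random_intercept(groups):
    """Method-of-moments fit of L_ij = L_avg + e_line_i + e_measure_ij.

    `groups` holds one list of measurements per line. Assumes a balanced
    design: at least 2 lines, each with the same number (>= 2) of measurements.
    """
    k = len(groups)
    n = len(groups[0]) if k else 0
    if k < 2 or n < 2 or any(len(g) != n for g in groups):
        raise ValueError("need >= 2 lines with the same number (>= 2) of measurements")

    line_means = [mean(g) for g in groups]
    grand_mean = mean(line_means)  # estimate of L_avg in the balanced case

    # Between-line and within-line mean squares from one-way ANOVA.
    msb = n * sum((m - grand_mean) ** 2 for m in line_means) / (k - 1)
    msw = sum((x - m) ** 2 for g, m in zip(groups, line_means) for x in g) / (k * (n - 1))

    return {
        "grand_mean": grand_mean,
        "var_measure": msw,                     # measurement variance
        "var_line": max(0.0, (msb - msw) / n),  # line-to-line variance, truncated at 0
        "se_grand_mean": math.sqrt(msb / (k * n)),
    }


# Made-up data: 3 lines, 2 measurements each.
fit = balanced_random_intercept([[9.9, 10.1], [11.9, 12.1], [10.9, 11.1]])
print(fit)
```

The `se_grand_mean` entry is one way to express confidence in the final value: it reflects both measurement noise and genuine line-to-line variation, which weighting by measurement error alone does not capture.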
