The variance of two running variances

Question

I have 2 sets, and for each one I only have the running variance, the sum, the running mean and the count. How can I get the variance of the 2 sets merged together?

EDIT:

The values of each set are updated every time a new occurrence arrives; the occurrences themselves are not stored.

The values of the sets are not equal.

I need to merge these 2 sets and get the variance of the merged set.

EDIT 2:

I think that what I need is the pooled variance, am I correct?

In Java it would be something like this:

Double variance = (((firstAggregate.count - 1) * firstAggregate.variance) + ((secondAggregate.count - 1) * secondAggregate.variance)) / ((firstAggregate.count + secondAggregate.count) - 2);

Could you be more specific? Please add more details to your question. – Pluviophile, Jun 10 '20 at 08:21

4.Pi.n · Accepted Answer · 2020-06-12T15:54:54.737

1

There is no need to merge the two sets and recompute the variance from scratch, which is time-consuming. You can keep the variance of each set separately and then obtain the total variance with an update formula.

$T_{1,m} = \sum_{i=1}^{m} x_i$

$S_{1,m} = \sum_{i=1}^{m} \left(x_i - \frac{1}{m} T_{1,m}\right)^2$

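These quantities are discussed in the paper "Updating Formulae and a Pairwise Algorithm for Computing Sample Variances" (Chan, Golub and LeVeque): $T$ is the sum of the values in a set and $S$ is the sum of their squared deviations from the set's mean.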
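As a sketch of how two such summaries combine, assume the first set holds $x_1, \dots, x_m$ and the second holds $x_{m+1}, \dots, x_{m+n}$. The index range and the symbol $n$ are notation assumed here and are not in the original answer. The pairwise combination then reads:

$T_{1,m+n} = T_{1,m} + T_{m+1,m+n}$

$S_{1,m+n} = S_{1,m} + S_{m+1,m+n} + \frac{m}{n(m+n)} \left(\frac{n}{m} T_{1,m} - T_{m+1,m+n}\right)^2$

The sample variance of the merged set is $S_{1,m+n} / (m+n-1)$. If only a sample variance $s^2$ is stored for a set of $m$ values, $S$ can be recovered as $(m-1)\,s^2$, assuming the stored variance uses the $m-1$ divisor.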

Update

Parallel algorithm
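A minimal Java sketch of the merge step, written in the form Wikipedia gives for the parallel algorithm (difference of means rather than sums). The class and method names (`RunningStats`, `fromSampleVariance`, `merge`) are hypothetical, and the sketch assumes each stored variance is a sample variance with divisor `count - 1`.

```java
// Hypothetical helper, not from the original post: holds the summary of one set.
final class RunningStats {
    final long count;   // number of values seen
    final double mean;  // running mean
    final double m2;    // sum of squared deviations from the mean (S in the answer)

    RunningStats(long count, double mean, double m2) {
        this.count = count;
        this.mean = mean;
        this.m2 = m2;
    }

    // Assumption: `variance` is the sample variance, so S = variance * (count - 1).
    static RunningStats fromSampleVariance(long count, double mean, double variance) {
        double m2 = count > 1 ? variance * (count - 1) : 0.0;
        return new RunningStats(count, mean, m2);
    }

    // Combines two summaries without access to the underlying values.
    RunningStats merge(RunningStats other) {
        if (count == 0) return other;
        if (other.count == 0) return this;
        long n = count + other.count;
        double delta = other.mean - mean;
        double mergedMean = mean + delta * other.count / n;
        // The last term accounts for the distance between the two means.
        double mergedM2 = m2 + other.m2
                + delta * delta * ((double) count * other.count / n);
        return new RunningStats(n, mergedMean, mergedM2);
    }

    // Sample variance of the summarised values; undefined for fewer than 2 values.
    double sampleVariance() {
        return count > 1 ? m2 / (count - 1) : Double.NaN;
    }
}
```

For example, merging the summaries of {1, 2, 3} (mean 2, variance 1) and {4, 5, 6, 7} (mean 5.5, variance 5/3) gives count 7, mean 4 and sample variance 28/6, the same as computing it over {1, ..., 7} directly. The pooled variance formula from the question gives (2 + 5) / 5 = 1.4 for the same input, because it leaves out the term for the difference between the two means.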

edited Jun 12 '20 at 15:54

answered Jun 11 '20 at 01:37

4.Pi.n


1

Thank you @m-zayan. That pairwise algorithm is what Wikipedia calls the parallel algorithm, and it is what I need. Could you add it to your answer so that it is more complete? :) https://en.wikipedia.org/wiki/Algorithms_for_calculating_variance#Parallel_algorithm – Bentipe Jun 12 '20 at 09:29
You are welcome. Note the edit to T: T is the sum over all elements in the set, not the mean. I have added it, thanks. – 4.Pi.n Jun 12 '20 at 15:33
