I need to calculate the average and standard deviation of a population. However, I only have the sample means and standard deviations, not the underlying observations.
To be more precise, I have a vector of means $\bar{\mathbf{x}}=(\bar{x}_1, \bar{x}_2, \ldots, \bar{x}_N)$ with corresponding vector $\mathbf{n}=(n_1, n_2, \ldots, n_N)$ of sample sizes and a vector of standard deviations $\mathbf{s}=(s_1, s_2, \ldots, s_N)$. I need to compute the overall average and standard deviation.
To obtain the overall mean, I use the formula $$\bar{x}=\dfrac{\sum_{i=1}^N\bar{x}_in_i}{\sum_{i=1}^Nn_i}.$$ For the overall standard deviation, I can take the square root of the overall variance $$Var=\frac{1}{n-1}\left(\sum_{j=1}^N(n_j-1)V_j+\sum_{j=1}^Nn_j(\bar{x}_j-\bar{x})^2\right),$$ where $n=\sum_{j=1}^Nn_j$ is the total number of observations and $V_j=s_j^2$ is the variance of the $j$-th sample.
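To make the batch computation concrete, here is a minimal Python sketch of the two formulas above. The function name `pooled_mean_and_sd` and its argument names are only illustrative, and it assumes $V_j=s_j^2$ and $n=\sum_j n_j$. It needs all three vectors in full and makes two passes over them.

```python
import math


def pooled_mean_and_sd(means, sizes, sds):
    """Combine per-sample means, sizes and standard deviations (batch version).

    Illustrative helper: assumes each s_j is a sample standard deviation
    computed with the (n_j - 1) denominator, so that V_j = s_j ** 2.
    """
    n = sum(sizes)  # total number of observations

    # Weighted average of the sample means.
    grand_mean = sum(m * k for m, k in zip(means, sizes)) / n

    # Within-sample part: sum of (n_j - 1) * V_j.
    within = sum((k - 1) * s ** 2 for k, s in zip(sizes, sds))

    # Between-sample part: sum of n_j * (mean_j - grand_mean)^2.
    between = sum(k * (m - grand_mean) ** 2 for m, k in zip(means, sizes))

    variance = (within + between) / (n - 1)
    return grand_mean, math.sqrt(variance)
```

For instance, the samples $(1,2,3)$ and $(4,5,6,7)$ have means $2$ and $5.5$, sizes $3$ and $4$, and variances $1$ and $5/3$; feeding those summaries in should reproduce the mean and standard deviation of the combined data $1,\ldots,7$.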
Is there a way to compute these values in an "online" / "streaming" way (e.g., when the vectors are infinite) using only the aforementioned vectors?
Thanks in advance
Note that this question can be seen as a complement to
https://stats.stackexchange.com/questions/216047/how-does-one-go-about-determining-the-standard-deviation-of-an-entire-sample-dat
, where the above formula for the variance is presented, and
https://stats.stackexchange.com/questions/72212/updating-variance-of-a-dataset
, where the mean and variance are updated at each new observation. In my case, I do not have access to the observations.