Calculating a weighted mean and weighted SD from multiple mean/SD/different sample sizes?

Question

QUESTION EDITED: I have 10 different patients who have undergone heart procedures, and we have collected measurements of cardiac electrical activity from the same region of all ten hearts. For a number of reasons, the data collected from each patient comprises a different number of points/measurements. The data has been analysed by proprietary software and we have been given only three values for each patient: the mean (electrical conduction velocity in cm/sec), the SD, and the number of points (n) that the software used to calculate that mean/SD.

Mean
SD
n

I have NOT been provided with the individual values associated with each point for each patient, only those three numbers.

The number of points for each patient varies quite a bit - some are 47, some are 17 so as an example, the first two patients have the following:

Patient 1

Mean = 2.5cm/s
SD = 0.8
n = 47

Patient 2

Mean = 3.5cm/s
SD = 1.5
n = 17

and so on for all 10 patients

Is there a way I can find (a) the MEAN of the MEANS and (b) an appropriate SD for the mean of the means WITHOUT needing the individual values of each point?

I think I understand how to find a weighted mean:

Multiply each patient's mean by that patient's n, add the products together, and divide by the cumulative n of all 10 patients (as in this example: "Can I take a mean of a set of means?")
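
As a rough sketch of that calculation in Python, using only the two patients shown above (the other eight are not given here, so this is not the final figure); `weighted_mean` is just an illustrative helper name:

```python
def weighted_mean(means, ns):
    """Sample-size-weighted mean of per-patient means."""
    if len(means) != len(ns):
        raise ValueError("means and ns must have the same length")
    total_n = sum(ns)
    # Each mean contributes in proportion to the number of points behind it.
    return sum(n * m for m, n in zip(means, ns)) / total_n


# Patients 1 and 2 from the question; extend the lists to all 10 patients.
means = [2.5, 3.5]  # cm/s
ns = [47, 17]

combined_mean = weighted_mean(means, ns)
print(combined_mean)
```

For these two patients this works out to (47 × 2.5 + 17 × 3.5) / 64, which is about 2.77 cm/s. It sits closer to Patient 1's mean because Patient 1 contributes more points.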

That should (I think) provide me with an overall mean of the conduction velocity of this region of the heart from these 10 patients. But how do I then calculate the SD of this mean of means for when we want to present this data in a medical journal? I would imagine it wouldn't work exactly the same way as a weighted mean?

Because questions like this come up often, I provided a full, general answer in the duplicate thread. In your case it is important to know exactly how the means and SDs were computed. (Many formulas exist for both, depending on how these groups were selected, what statistical assumptions are being made, and the purposes of computing the statistics.) Also, "SD of the mean of the entire population" makes no sense: the population has a unique mean, by definition, and that's not random, so it has no SD. — whuber, Jan 14 '21 at 14:59
So in an attempt to be succinct/not ramble I simplified, but it's not really a "population" in that sense. The 10 groups are actually 10 patients undergoing heart procedures, and the sample size for each patient is actually a collection of measurements from their heart (conduction velocity in cm/sec), but a sample of measurements from this region, as opposed to all measurements that were possible. Thus n for each is actually number/sample of measurements taken from a specific region in the heart, and mean/SD of those measurements, but if you're doing mean of means, there would also be SD? — KS87, Jan 14 '21 at 15:29
It sounds like you haven't accurately described your situation or objectives. The "population" of interest may be a set of *people* and what you seem to be seeking is an estimate of some characteristics of those people based on multiple measurements per person. Instead, your current question views the *measurements* as coming from a population and an answer to it would likely give you information that is meaningless (or misleading) when applied to people. It therefore is crucial to explain what your data mean and what you are trying to achieve with these statistics. — whuber, Jan 14 '21 at 15:35
Yes you are right, your second sentence is exactly the case - I didn't realise it at the time (not very experienced with stats). How would you suggest I go about trying to find the answer? Ask a new question with the relevant details? — KS87, Jan 14 '21 at 16:32
Sure I could do that. I suggested new because you had closed this one - would you reopen it once edited? — KS87, Jan 14 '21 at 17:11
The edit will cause users to vote to reopen it. The advantage of this approach is to retain the comment thread. — whuber, Jan 14 '21 at 17:24
OK thanks, I have edited the question now - fingers crossed someone can help! — KS87, Jan 14 '21 at 20:22
I'm not sure I understand your goal. But how about using the weighted mean $$\bar X_c = \frac{n_1\bar X_1 + n_2\bar X_2 +\cdots +n_{10}\bar X_{10}} {n_1 + n_2+\cdots + n_{10}},$$ the pooled variance $$S_c^2 = \frac{(n_1-1)S_1^2 +(n_2-1)S_2^2+ \cdots + (n_{10}-1)S_{10}^2} {n_1+n_2+\cdots + n_{10} - 10},$$ and then $S_c=\sqrt{S_c^2}?$ — BruceET, Jan 16 '21 at 08:44
Hi, I think this probably is the answer! Working out SD from variances seems a little bit of a roundabout way of doing things, given that I initially start with SD, have to convert to variance (presumably by squaring?) then work out pooled variance, and then square root the pooled variance for pooled SD? But if it gets the right answer, then I'm happy! Am I right in saying that going from SD to variance requires squaring the SD? — KS87, Jan 16 '21 at 16:48
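
On the last point: yes, the variance is the SD squared, so the pooled SD is found by squaring each SD, pooling, and taking the square root. A minimal sketch of the pooled-variance formula from the comment above, again using only Patients 1 and 2. It assumes the software reported ordinary sample SDs (n − 1 denominator), which the question does not confirm, and `pooled_sd` is only an illustrative name:

```python
import math


def pooled_sd(sds, ns):
    """Pooled SD: square each SD, weight by (n - 1), divide by total df, take the root."""
    if len(sds) != len(ns):
        raise ValueError("sds and ns must have the same length")
    # Convert each SD to a variance by squaring, weighted by its degrees of freedom.
    numerator = sum((n - 1) * sd ** 2 for sd, n in zip(sds, ns))
    # Total degrees of freedom: sum of n minus the number of patients.
    dof = sum(ns) - len(ns)
    return math.sqrt(numerator / dof)


# Patients 1 and 2 from the question; extend the lists to all 10 patients.
sds = [0.8, 1.5]
ns = [47, 17]

combined_sd = pooled_sd(sds, ns)
print(combined_sd)
```

For these two patients the pooled variance is (46 × 0.8² + 16 × 1.5²) / 62 = 65.44 / 62. Note that this pooled SD summarises the spread of measurements within patients only; it does not reflect how much the patient means differ from one another, so whether it is the right figure to report depends on the goal discussed in the comments.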
