I have a number of measurement samples, some with 2 measurements and some with 3. I wish to make the most accurate estimate of the population variance that I can, and I understand that ignoring data is taboo.
[Edit: More specifically, I have measured several different things through the same process. I expect them to have different means (due to different circumstances) but each source to be normally distributed, and I suspect that the nature of the introduced noise (the source of variance) is the same for all of them, so that they share a single population variance.
My understanding is that if they do share a population variance, then that population variance should be used for confidence intervals rather than the individual sample variances, since by Basu's and Cochran's theorems the deviation of the sample mean from the population mean and the sample variance are independent, as a special characteristic of normal distributions.
Each sample has two or three measurements (two or three numbers believed to be drawn from the same normal distribution), and I wish to check the likelihood of the sample variances originating from the same population variance. To do this (a scaled chi-squared distribution comparison), I first want to estimate that population variance.
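For reference, the standard fact I am leaning on here is that, under normality, each sample's sum of squared deviations is a $\sigma^2$-scaled chi-squared variable:

$$\frac{(n-1)s^2}{\sigma^2} \sim \chi^2_{n-1},$$

which is what makes comparing the sample variances against a candidate population variance a scaled chi-squared comparison.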
My current suspicion is that I should calculate one estimate for n=2 and one for n=3, with each group of sample variances (of the same sample size) corrected by multiplying by $\frac{n}{n - 1 + \frac{2}{SampleNumber_n}}$ before averaging. After getting these two numbers, I should take an inverse-MSE weighted average of them, treating each estimate's MSE as its variance about the true population variance, where the MSE term follows the equation in the answer to 'Estimate of variance with the lowest mean square error' (linked below) and the common $\sigma^4$ factor cancels between the numerator and denominator of the weights.
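To make this concrete, here is a minimal sketch of the procedure as I understand it, in Python with NumPy; the data in `groups` are made up for illustration, and the relative-MSE expression assumes I have correctly adapted the linked answer's formula to an average of $SampleNumber_n$ sample variances:

```python
import numpy as np

def combined_variance_estimate(groups):
    """groups: dict mapping sample size n -> list of measurement
    sequences, each of length n, assumed normal with common variance."""
    estimates, weights = [], []
    for n, samples in groups.items():
        N = len(samples)                          # SampleNumber_n
        d = n - 1 + 2.0 / N                       # proposed MSE-optimal denominator
        # average the per-sample sums of squared deviations, divide by d
        ss = [np.sum((np.asarray(s) - np.mean(s)) ** 2) for s in samples]
        estimates.append(np.mean(ss) / d)
        # relative MSE of this estimate, with the sigma^4 factor cancelled:
        #   Var/sigma^4 = 2(n-1)/(N d^2),  Bias^2/sigma^4 = ((n-1)/d - 1)^2
        rel_mse = 2 * (n - 1) / (N * d**2) + ((n - 1) / d - 1) ** 2
        weights.append(1.0 / rel_mse)             # inverse-MSE weight
    return np.average(estimates, weights=weights)

# made-up data: three samples of size 2, two samples of size 3
groups = {2: [[1.0, 1.4], [2.1, 1.8], [0.9, 1.1]],
          3: [[1.2, 1.5, 0.8], [2.0, 1.7, 2.3]]}
print(combined_variance_estimate(groups))
```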
However, though this sounds intuitively hopeful, I have often in the past suspected things that made intuitive sense and then turned out to be incorrect. Beyond that, I am aware that my trains of thought can be hard to follow, and I feel that even if I were correct about this, it would be better to cite an authoritative source saying this is a valid course of action than to justify it on a shaky foundation; failing that (again, assuming it is correct), I should improve my understanding to the point where I can confidently explain why it must be true.]
Relevant links are Bessel's correction § Caveats and Estimate of variance with the lowest mean square error.
I understand that $MSE = Var + Bias^2$ .
If I understand correctly, $Var(Mean) = \frac{Var}{SampleNumber}$ .
If I understand correctly, $MSE(Mean) = \frac{Var}{SampleNumber} + Bias^2$ , such that as sample number approaches infinity the distance of the mean from the population mean approaches the Bias.
In the linked question's answer, for a single sample variance, $\text {Var}(s_d^2) = 2\sigma^4(n - 1) / d^2$. However, if I understand correctly, if SampleNumber sample variances with the same degrees of freedom were averaged together, the $Bias^2$ term in the mean's MSE equation would be the same, whereas the mean's $Var$ would instead be $2\sigma^4(n - 1) / (SampleNumber \cdot d^2)$. For $SampleNumber = 2$ this is $\sigma^4(n - 1) / d^2$ and the end equation becomes $d = n - 1 + 1$; more generally, $d = n - 1 + \frac{2}{SampleNumber}$.
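Writing out the minimisation as I understand it (with $a = n - 1$ and $N = SampleNumber$ for brevity, and $s_d^2 = SS/d$ where $SS \sim \sigma^2\chi^2_{n-1}$), the averaged estimator has

$$\frac{\text{MSE}}{\sigma^4} = \frac{2a}{N d^2} + \left(\frac{a}{d} - 1\right)^2,$$

and setting the derivative with respect to $d$ to zero,

$$-\frac{4a}{N d^3} - \frac{2a^2}{d^3} + \frac{2a}{d^2} = 0 \quad\Longrightarrow\quad d = a + \frac{2}{N} = n - 1 + \frac{2}{SampleNumber}.$$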
Returning to my situation of choosing corrections for and combining (averaging?) the sample variances: if I used the $n-1$ correction to make each estimator unbiased, then the course of action would be straightforward.
$$\underbrace{\sum_{i=1}^{n}(x_i - \mu)^2}_{\text{sum of squares from population mean}} = \underbrace{\sum_{i=1}^{n}(x_i - \bar{x})^2}_{\text{sum of squares from sample mean}} + \; n\,(\bar{x} - \mu)^2$$
When the distance of the sample mean from the population mean (the bias) is 0, I can add the two samples' sums of squares directly, then divide by the total sample number to get a mean with proportionally shrunk variance, without worrying about the $Bias^2$ terms, which are both equal to 0.
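As a sanity check on the identity above (a toy example with made-up numbers; `mu` stands in for the known population mean):

```python
import numpy as np

rng = np.random.default_rng(0)
mu = 5.0                                    # population mean
x = rng.normal(mu, 2.0, size=3)             # one sample of n = 3
n = len(x)

lhs = np.sum((x - mu) ** 2)                 # sum of squares from population mean
rhs = np.sum((x - x.mean()) ** 2) + n * (x.mean() - mu) ** 2
print(np.isclose(lhs, rhs))                 # True: the decomposition holds
```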
However, when I try to see what happens if I attempt the same for the lowest-MSE corrections, I get into this sort of tangle:
$$\sum_{i=1}^{n}(x_i - \mu)^2 = \sum_{i=1}^{n}(x_i - \bar{x}_{\text{new}})^2 + n\,(\bar{x}_{\text{new}} - \bar{x}_{\text{old}})^2 + 2n\,(\bar{x}_{\text{new}} - \bar{x}_{\text{old}})(\bar{x}_{\text{old}} - \mu) + n\,(\bar{x}_{\text{old}} - \mu)^2$$

which gets more of a tangle the more I try to expand it (and add the corresponding terms for the second distribution): the cross term no longer vanishes, so the sums of squares can no longer simply be added.
What to do?