Sampling distribution of average of some covariance matrices

Question

I have $K$ datasets, each with $N$ variables and $M$ samples (they are in fact EEG time series, but I discard time and treat them as $K$ i.i.d. multivariate samples), and I assume they all come from the same multivariate normal distribution.

I am interested in estimating the covariance matrix. This can be done in two ways:

1. Concatenating the datasets and calculating the covariance matrix of the result. Its sampling distribution would be Wishart, given the assumption that the samples are multivariate normal.
2. Calculating the covariance matrix separately for each dataset and averaging those matrices (arithmetic mean) to form one overall covariance matrix.
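To make the two estimators concrete, here is a minimal sketch in Python with NumPy. The sizes `K`, `M`, `N`, the seed, and the simulated "true" covariance are all made-up illustration values, not my EEG data. It also checks the standard decomposition of the concatenated estimate into the within-dataset part (which is what method 2 averages) plus a term from the scatter of the dataset means, assuming every dataset has the same number of samples $M$.

```python
import numpy as np

rng = np.random.default_rng(0)
K, M, N = 8, 200, 4  # hypothetical: datasets, samples per dataset, variables

# Hypothetical "true" covariance, only used to simulate data.
A = rng.standard_normal((N, N))
true_cov = A @ A.T + N * np.eye(N)

# Each dataset has shape (M, N): rows are samples, columns are variables.
datasets = [rng.multivariate_normal(np.zeros(N), true_cov, size=M)
            for _ in range(K)]

# Method 1: concatenate, then one covariance (divides by K*M - 1).
pooled = np.cov(np.vstack(datasets), rowvar=False)

# Method 2: one covariance per dataset (each divides by M - 1), then average.
per_dataset = np.stack([np.cov(X, rowvar=False) for X in datasets])
averaged = per_dataset.mean(axis=0)

# Scatter of the K dataset means around the grand mean, weighted by M.
means = np.stack([X.mean(axis=0) for X in datasets])
centered = means - means.mean(axis=0)
between = M * centered.T @ centered

# Decomposition (equal M): total scatter = within scatter + between scatter.
reconstructed = (K * (M - 1) * averaged + between) / (K * M - 1)

max_gap = np.abs(pooled - averaged).max()
identity_error = np.abs(pooled - reconstructed).max()
print("largest |pooled - averaged| entry:", max_gap)
print("largest decomposition error:", identity_error)
```

Under these assumptions the two estimates differ only through the between-dataset mean term and the different divisors, so the gap shrinks as $M$ grows.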

The first method is straightforward and has well-established properties, but the second is in most cases much more feasible in my environment.

From the variance of the Wishart distribution (the S.E. of element $C_{ij}$ of the covariance matrix $C$ equals $\sqrt{\frac{C_{ij}^2 + C_{ii} C_{jj}}{M-1}}$) and the CLT (Central Limit Theorem), I conclude that both the expected value and the standard error of the covariance estimate should agree for the two methods.
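One way to check this claim numerically is a small Monte Carlo experiment. The sketch below is only an illustration under assumptions: the $2\times 2$ covariance `sigma`, the sizes `K`, `M`, and the replicate count `R` are invented, and the data are exactly i.i.d. normal with equal $M$ per dataset. It uses the formula above with the true covariance entries and with the degrees of freedom that each estimator actually has: $KM-1$ for the concatenated estimate, and $K(M-1)$ for the average, since a sum of $K$ independent Wishart matrices with $M-1$ degrees of freedom each is Wishart with $K(M-1)$.

```python
import numpy as np

rng = np.random.default_rng(1)
K, M, R = 5, 20, 20000  # hypothetical: datasets, samples each, MC replicates
sigma = np.array([[2.0, 0.6],
                  [0.6, 1.0]])  # hypothetical true covariance

# X has shape (R, K, M, 2): R independent repetitions of the whole experiment.
X = rng.multivariate_normal(np.zeros(2), sigma, size=(R, K, M))

# Method 2: covariance per dataset (divisor M - 1), averaged over K datasets.
Xc = X - X.mean(axis=2, keepdims=True)
S_k = np.einsum("rkmi,rkmj->rkij", Xc, Xc) / (M - 1)
averaged = S_k.mean(axis=1)

# Method 1: concatenate the K datasets, one covariance (divisor K*M - 1).
P = X.reshape(R, K * M, 2)
Pc = P - P.mean(axis=1, keepdims=True)
pooled = np.einsum("rmi,rmj->rij", Pc, Pc) / (K * M - 1)

def wishart_se(dof):
    """S.E. of the off-diagonal element for a given number of degrees of freedom."""
    return np.sqrt((sigma[0, 1] ** 2 + sigma[0, 0] * sigma[1, 1]) / dof)

se_avg_mc = averaged[:, 0, 1].std(ddof=1)
se_pool_mc = pooled[:, 0, 1].std(ddof=1)
se_avg_theory = wishart_se(K * (M - 1))
se_pool_theory = wishart_se(K * M - 1)

print("averaged: simulated", se_avg_mc, "formula", se_avg_theory)
print("pooled:   simulated", se_pool_mc, "formula", se_pool_theory)
```

In this setting the two formula values differ only by the factor $\sqrt{(KM-1)/(K(M-1))}$, which is close to 1 when $M$ is large.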

Yet (obviously) the two methods do not produce numerically identical covariance matrices.

1. Is it really true that both ways of estimating the covariance matrix have the same standard error?
2. Can the Wishart distribution be approximated by a normal distribution as the sample-size parameter goes to infinity (just as we do for the chi-squared distribution)? If so, under what conditions is the approximation reasonably good?
3. My guess is that, if the matrix elements of a Wishart-distributed matrix can be approximated by a normal distribution, then the validity of the second method depends on the validity of this approximation. Please correct me if I am wrong.

I need these answers to justify (if it is justifiable at all :-) ) using the two estimators interchangeably in an article about the performance of some algorithms for joint diagonalization of $C$.

Does each of your datasets contain the same number of samples? — onestop, Mar 23 '12 at 13:56
Were you able to solve this problem? I have a similar problem and I am wondering whether it is safe to average the covariance matrices — Wis, May 31 '17 at 16:24
If you truly are interested just in standard errors of the components of these matrices, then your question easily reduces to the one-dimensional case. The answers in that case are intuitively clear: the degrees of freedom you lose by separately estimating $K$ covariances inflate the standard error compared to combining all the data. When $K\ll M$ that's probably not worth worrying about. — whuber, Jan 15 '19 at 16:42
