
Suppose I took $M$ samples from a random vector $X_1, X_2, \ldots, X_N$ (where $N < M$) and calculated a covariance matrix $S$ like the one below, where the redder the color, the higher the value (the illustration is taken from Nilearn):

[Figure: covariance matrix heatmap, taken from Nilearn]

No assumption is made regarding the joint distribution (though the signals are certainly not white noise), and $M \approx 100$. I want to prove, using a statistical test, that the covariances in that matrix are negligible. I can see some approaches: ANCOVA, a Hotelling $T^2$ test adapted to the eigenvalues of the covariance matrix, or some of the methods mentioned here. Is there a simpler way to prove it?

JMFS
  • One of the problems with inference on eigenvalues—even of perfectly uncorrelated data—is that they (unlike the vectors) are *strongly* dependent. – Alexis Mar 21 '18 at 19:40
  • You'll need to define "negligible" carefully; "0" is easy to define, and there's only one way the correlations can be 0, but "negligible" can be defined in many ways, e.g., $\max |\rho_{ij}| < 0.05$, $\sum |\rho_{ij}| < 100$, the determinant of $S$ / the determinant of $\text{diag } S$ is less than some number,... – jbowman Mar 21 '18 at 19:47
  • I see, $|S|/|\text{diag}(S)| \to 1$ is a way of measuring how diagonal the matrix is, although the way Pearson coefficients are employed [here](https://math.stackexchange.com/questions/1392491/measure-of-how-much-diagonal-a-matrix-is) may be another approach (see the small sketch after these comments). – JMFS Mar 21 '18 at 20:20
  • @juanma2268 BobDurrant pointed out an error in my answer (I had been treating the covariances like variances; since covariances can be negative, the needed tests are two-sided). I have amended my answer... a tad more complicated, but doable, I think. – Alexis Mar 21 '18 at 23:44
  • And I am going to need a big cup of cocoa to digest it, haha. – JMFS Mar 22 '18 at 02:54
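
For concreteness, here is a minimal sketch (not from the thread) of that determinant ratio in Python/numpy; the matrix `S` is just a randomly generated stand-in:

```python
import numpy as np

rng = np.random.default_rng(1)
# stand-in covariance matrix; substitute your own S
S = np.cov(rng.normal(size=(100, 5)), rowvar=False)

# |S| / |diag(S)| lies in (0, 1] for positive-definite S (Hadamard's
# inequality) and equals 1 exactly when S is diagonal
ratio = np.linalg.det(S) / np.linalg.det(np.diag(np.diag(S)))
print(ratio)
```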

1 Answer


Statistics generally don't prove anything in the sense of 100% certainty; they provide evidence. One way to provide such evidence for your purposes is via re-randomization of your data:

Choose a type I error rate ($\alpha$).

Choose $\delta$—the smallest positive number that you would consider relevantly different from zero (i.e., maybe anything below .002 is, for your purposes, effectively zero).

Create, say, 9,999 data sets by re-randomizing (shuffling) each variable independently of every other variable (each shuffle gives an M-by-N data set with precisely the same univariate distributions, but with covariances due entirely to chance). Your originally observed data is the 10,000th data set.
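
A minimal sketch of the shuffling step in Python/numpy (the array `X` and its dimensions are hypothetical stand-ins for your data):

```python
import numpy as np

rng = np.random.default_rng(0)
M, N = 100, 6                # stand-in dimensions; use your own data here
X = rng.normal(size=(M, N))  # stand-in for the observed M-by-N data

n_total = 10_000
perm_data = np.empty((n_total, M, N))
for b in range(n_total - 1):
    for j in range(N):
        # shuffle each column (variable) independently of the others:
        # univariate distributions are preserved exactly, while any
        # covariance between columns is due entirely to chance
        perm_data[b, :, j] = rng.permutation(X[:, j])
perm_data[-1] = X            # the observed data is the 10,000th data set
```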

Estimate the covariance matrix $\mathbf{\Sigma}$ of each data set, and extract its $\frac{N^2-N}{2}$ distinct covariances $\sigma_{ij}$.
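
Continuing the sketch above, the covariance matrix of each data set and its distinct off-diagonal entries could be collected like this:

```python
# upper-triangle indices select the (N**2 - N) / 2 distinct covariances
iu = np.triu_indices(N, k=1)
covs = np.stack([np.cov(perm_data[b], rowvar=False)[iu]
                 for b in range(n_total)])
# covs has shape (10_000, N * (N - 1) // 2); covs[-1] is the observed set
```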

Perform $\frac{N^2-N}{2}\times 10,000$ tests of the form:

$H^{-}_{0}: |\sigma_{ij}| \ge \delta$
$H^{-}_{A}: |\sigma_{ij}| < \delta$

These require two one-sided tests to actually do the inference:

$H^{-}_{01}: \sigma_{ij} \ge \delta$
$H^{-}_{A1}: \sigma_{ij} < \delta$

$H^{-}_{02}: \sigma_{ij} \le -\delta$
$H^{-}_{A2}: \sigma_{ij} > -\delta$

Your p-value ($p_{1}$) for a test of $H^{-}_{01}$ for a single $\sigma_{ij}$ is the number of rejections of its $H^{-}_{01}$ (the number of times $\hat{\sigma}_{ij}\le\delta$) divided by 10,000.

Your p-value ($p_{2}$) for a test of $H^{-}_{02}$ for a single $\sigma_{ij}$ is the number of rejections of its $H^{-}_{02}$ (the number of times $\hat{\sigma}_{ij}\ge-\delta$) divided by 10,000.
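
Continuing the sketch, here are the two p-values per covariance, implemented exactly as the counting rule above describes (`delta` is the a priori equivalence threshold, set here to the .002 from earlier):

```python
delta = 0.002  # equivalence threshold; must be chosen a priori

# p1: fraction of the 10,000 data sets in which sigma_hat_ij <= delta
p1 = (covs <= delta).mean(axis=0)
# p2: fraction of the 10,000 data sets in which sigma_hat_ij >= -delta
p2 = (covs >= -delta).mean(axis=0)
```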

Perform the Benjamini-Hochberg false discovery rate adjustment for multiple comparisons for your $2 \times \frac{N^2-N}{2} = N^2-N$ tests (let's call this number $m$ for a moment; a code sketch follows the list) by:

  1. Ordering the p-values (both $p_{1}$ and $p_{2}$) from largest to smallest (and retaining which p-value goes with which covariance you are testing)
  2. Calculating $\alpha^{*}_{i} = \frac{\alpha\times(m+1-i)}{m}$, where $i$ is the rank in the ordered p-values. (It is $\alpha$, not $\alpha/2$, because the joint TOST nulls are non-intersecting.)
  3. In order from smallest to largest, comparing $p_{i}$ to $\alpha^{*}_{i}$: if both $p_{1}\le \alpha^{*}_{i}$ and $p_{2}\le \alpha^{*}_{i}$ (for their respective $i$s), reject $H^{-}_{0i}$, along with all remaining $H^{-}_{0>i}$ for which both $H^{-}_{01}$ and $H^{-}_{02}$ are rejected at $i$ or later. Then stop.
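
A sketch of this adjustment, written in the standard smallest-to-largest step-up form (equivalent to the largest-to-smallest ordering in step 1; `alpha = 0.05` is a hypothetical choice):

```python
alpha = 0.05  # type I error rate; must be chosen a priori

p = np.concatenate([p1, p2])            # m = N**2 - N p-values in total
m = p.size
order = np.argsort(p)                   # indices, smallest p first
crit = alpha * np.arange(1, m + 1) / m  # BH critical values alpha * i / m
passed = p[order] <= crit
k = passed.nonzero()[0].max() + 1 if passed.any() else 0

reject = np.zeros(m, dtype=bool)
reject[order[:k]] = True                # step-up: reject the k smallest
# a covariance is declared equivalent to zero only when BOTH of its
# one-sided tests are rejected
equivalent = reject[:p1.size] & reject[p1.size:]
```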

If you reject $H^{-}_{0}$ for a covariance $\sigma_{ij}$, you conclude that the covariance is equivalent to zero, given your preferred type I error rate $\alpha$ and your equivalence/relevance threshold $\delta$.

Remember: you must choose $\delta$ and $\alpha$ a priori... otherwise you are in p-hacking territory. NB: if your preferred type I error rate is really tiny ($\alpha=0.0001$, say), you will want to increase the total number of data sets accordingly. For example, it is difficult to reliably estimate 0.0001-sized probabilities in a sample size of 10,000.

Alexis
  • Thanks. By re-randomization do you mean to change the order of the random variables (for example, $X_2, X_1, \ldots X_N$) and then generate $M$ new experiments for each combination? – JMFS Mar 21 '18 at 20:40
  • Probably that should be $|\sigma_{ij}|$ for the test statistic? Also maybe better to look at the (absolute) correlations rather than the covariances, since they are scale-independent? – Bob Durrant Mar 21 '18 at 20:42
  • @juanma2268 No: I mean shuffle the values *within* each variable. So you are effectively creating a bunch of new data sets. – Alexis Mar 21 '18 at 23:28
  • @BobDurrant Damnit! You are *totally* correct. I saw sigma and started thinking standard deviation.... I will revise to equivalence test, and good catch! – Alexis Mar 21 '18 at 23:29
  • @Alexis many thanks. Have you used this method before in a publication so I can cite you more properly? – JMFS Mar 22 '18 at 03:13
  • @juanma2268 Not *per se*; it combines a permutation test with equivalence tests and control of the false discovery rate, and there are good citations for each of these. I think there's one tiny niggle left: for step three, you would reject all remaining $H^{-}_{0}$ for which ***both*** $H^{-}_{01}$ and $H^{-}_{02}$ are rejected. – Alexis Mar 22 '18 at 04:15