5

A zero entry in the precision matrix (the inverse of the covariance matrix) means the corresponding variables are independent given all the other variables. For real-world data samples, when is an entry in the precision matrix small enough to be treated as a zero?

In my data sample, if I adjust the precision matrix so that all values < 0.004 are set to zero, the corresponding correlations do not change significantly. I got to this value by trial and error: setting the threshold to 0.005 does cause significant changes in the correlations.

The threshold value for precision matrix entries, below which variables can be considered conditionally independent, depends on the sample size, on the number of variables (the size of the matrix), and on the other values in the matrix. Is there any way other than trial and error to find it?
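For concreteness, the trial-and-error procedure described above can be sketched like this (a minimal sketch, assuming the data sit in a NumPy array `X`; the helper names `cov_to_corr` and `threshold_precision` are hypothetical, not from any package):

```python
import numpy as np

def cov_to_corr(c):
    """Convert a covariance matrix to the corresponding correlation matrix."""
    d = np.sqrt(np.diag(c))
    return c / np.outer(d, d)

def threshold_precision(X, threshold):
    """Zero out small off-diagonal entries of the precision matrix, invert
    back, and return (original correlations, correlations after
    thresholding, number of entries zeroed) for comparison."""
    cov = np.cov(X, rowvar=False)
    prec = np.linalg.inv(cov)
    prec_thr = prec.copy()
    # Only off-diagonal entries are candidates for conditional independence
    mask = (np.abs(prec_thr) < threshold) & ~np.eye(len(prec), dtype=bool)
    prec_thr[mask] = 0.0
    cov_thr = np.linalg.inv(prec_thr)  # invert back to compare correlations
    return cov_to_corr(cov), cov_to_corr(cov_thr), int(mask.sum())
```

One would then scan candidate thresholds (e.g. 0.004 vs 0.005) and keep the largest one for which the maximum absolute change between the two correlation matrices is still acceptable.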

whuber
Ivana

1 Answer

3

Finding a covariance matrix that fits the data and has a conveniently large number of zero entries in its inverse is known as covariance selection (1). Zeros in the inverse covariance matrix are desirable for both computational and conceptual reasons: they indicate conditional independence between variables, making the model smaller.

Covariance selection is an active field of research with applications in domains from proteomics to economics. Several algorithms have been proposed and implementations are available in statistical software, for example the glasso and smac packages in R. This presentation (2) provides a really nice overview.
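glasso itself is an R package, but the same graphical-lasso estimator is available elsewhere, for example as `GraphicalLasso` in scikit-learn. A minimal sketch (the penalty `alpha` and the simulated data are illustrative assumptions, not values from the question):

```python
import numpy as np
from sklearn.covariance import GraphicalLasso

rng = np.random.default_rng(0)
# Illustrative data: 9 variables, 200 observations, as a stand-in
X = rng.normal(size=(200, 9))

# alpha is the L1 penalty; a larger alpha produces more exact zeros
# in the estimated precision matrix
model = GraphicalLasso(alpha=0.5).fit(X)
precision = model.precision_

off_diag = ~np.eye(precision.shape[0], dtype=bool)
n_zero = int(np.isclose(precision[off_diag], 0.0).sum())
print(f"{n_zero} of {off_diag.sum()} off-diagonal entries are zero")
```

Unlike thresholding, the penalty enforces sparsity during estimation, so the returned precision matrix is guaranteed to correspond to a positive-definite covariance matrix.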

Setting small values to zero achieves the same goal, but the glasso algorithm finds more zeros. For my particular problem of only 9 variables, I could change 3 (of 36) connections to 0 by thresholding, while glasso found 9.

(1) Dempster, A. P. (1972). Covariance Selection. Biometrics, 28, 157–175.

(2) Olsen, P., Oztoprak, F., Nocedal, J., & Rennie, S. (2012). Sparse Inverse Covariance Estimation. Summer tutorial at IBM T. J. Watson Research Center.

kjetil b halvorsen
Ivana
  • 2
    I've never used this procedure myself, but I'd be concerned that by setting some values in $\Sigma^{-1}$ to $0$, the covariance matrix $\Sigma$ would no longer be positive definite and therefore not invertible, breaking the relationship between the covariance and precision matrices. Does your research shed any light on this problem? – Sycorax Aug 27 '15 at 14:13
  • Unfortunately it's trial and error: I set a threshold below which values are turned to 0, then I invert the matrix back, recalculate the correlation matrix, and check whether the differences are acceptable. This ensures a PD matrix, but finds fewer conditional independencies than better algorithms. I wish I understood the stability of matrix inversion better; I asked a related question about that: http://stats.stackexchange.com/questions/145412/stability-precision-matrix-under-small-changes-in-covariance?rq=1 – Ivana Aug 28 '15 at 11:27