
Chebyshev's inequality generalizes the 68-95-99.7 rule for normal distributions, bounding how much probability mass can lie beyond a given number of standard deviations from the mean:

$$ P\big( \big\vert X-\mu \big\vert \ge k\sigma \big)\le\dfrac{1}{k^2} $$
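For example, $k=2$ gives $P(|X-\mu|\ge 2\sigma)\le 1/4$ for *any* distribution with finite variance, compared with the roughly $5\%$ a normal distribution attains.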

In a multivariate distribution, can we do something similar with the Mahalanobis distance substituted for $\sigma$? I would expect the inequality to involve the dimension of the multivariate random variable $X$ and to reduce to the usual Chebyshev inequality when $X$ is univariate.

Dave

1 Answer


Suppose $\boldsymbol X$ is a $p$-dimensional random vector with mean vector $\boldsymbol \mu$ and dispersion matrix $\Sigma$.

If $\Sigma$ is positive definite, then we can write $\Sigma=BB^T$ for some nonsingular matrix $B$. Using the transformation $\boldsymbol X\mapsto B^{-1}(\boldsymbol X-\boldsymbol\mu)=\boldsymbol Y$, we have

$$(\boldsymbol X-\boldsymbol \mu)^T\Sigma^{-1}(\boldsymbol X-\boldsymbol \mu)=\boldsymbol Y^T\boldsymbol Y=\sum_{i=1}^p Y_i^2$$

Each $Y_i$ has zero mean and unit variance, since $E(\boldsymbol Y)=B^{-1}E(\boldsymbol X-\boldsymbol\mu)=\boldsymbol 0$ and $\operatorname{Var}(\boldsymbol Y)=B^{-1}\Sigma\,(B^{-1})^T=I_p$.
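Here is a minimal numerical sketch of the whitening step, assuming NumPy and taking $B$ to be the Cholesky factor of $\Sigma$ (any other square root works too); the mean vector and covariance matrix below are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical parameters for illustration: p = 3 with an arbitrary
# positive-definite covariance matrix.
mu = np.array([1.0, -2.0, 0.5])
Sigma = np.array([[2.0, 0.5, 0.3],
                  [0.5, 1.0, 0.2],
                  [0.3, 0.2, 1.5]])

# One valid choice of B with Sigma = B B^T: the lower Cholesky factor.
B = np.linalg.cholesky(Sigma)

# Sample X (Gaussian only for concreteness; the identity below is
# distribution-free given this mean and covariance).
X = rng.multivariate_normal(mu, Sigma, size=200_000)

# Whitening: Y = B^{-1}(X - mu), computed via a solve rather than an
# explicit inverse.
Y = np.linalg.solve(B, (X - mu).T).T

print(Y.mean(axis=0))            # ~ zero mean
print(np.cov(Y, rowvar=False))   # ~ identity covariance

# The quadratic form equals the sum of squared whitened coordinates.
d2 = np.einsum('ij,jk,ik->i', X - mu, np.linalg.inv(Sigma), X - mu)
print(np.allclose(d2, (Y ** 2).sum(axis=1)))  # True (up to float error)
```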

Using Markov's inequality, for $k>0$,

$$P\left(\sum_{i=1}^p Y_i^2 \ge k^2\right)\le \frac{E\left(\sum_{i=1}^p Y_i^2\right)}{k^2}$$

Since $E(Y_i^2)=\operatorname{Var}(Y_i)=1$ for each $i$, the numerator on the right equals $p$. In other words,

$$P\left((\boldsymbol X-\boldsymbol \mu)^T\Sigma^{-1}(\boldsymbol X-\boldsymbol \mu)\ge k^2\right)\le \frac{p}{k^2}$$

Or, in terms of the Mahalanobis distance of $\boldsymbol X$ from $\boldsymbol \mu$,

$$P\left(\sqrt{(\boldsymbol X-\boldsymbol \mu)^T\Sigma^{-1}(\boldsymbol X-\boldsymbol \mu)}\ge k\right)\le \frac{p}{k^2}$$
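Note that for $p=1$ this reduces to the usual univariate Chebyshev inequality, as anticipated in the question. As a sanity check, here is a small Monte Carlo sketch of the bound, reusing the same made-up covariance matrix as above (the Gaussian sampling is again only for concreteness; the bound holds for any distribution with these first two moments):

```python
import numpy as np

rng = np.random.default_rng(1)

p = 3
mu = np.zeros(p)
Sigma = np.array([[2.0, 0.5, 0.3],
                  [0.5, 1.0, 0.2],
                  [0.3, 0.2, 1.5]])
Sigma_inv = np.linalg.inv(Sigma)

X = rng.multivariate_normal(mu, Sigma, size=1_000_000)

# Squared Mahalanobis distance of each draw from the mean.
d2 = np.einsum('ij,jk,ik->i', X - mu, Sigma_inv, X - mu)

# Empirical tail probability vs. the p/k^2 bound.
for k in (2.0, 3.0, 5.0):
    empirical = (d2 >= k ** 2).mean()
    print(f"k={k}: P(d >= k) = {empirical:.5f} <= p/k^2 = {p / k**2:.5f}")
```

For Gaussian data the empirical tail is far below $p/k^2$ (here $d^2$ is $\chi^2_p$), which is expected: like its univariate counterpart, the bound is distribution-free and therefore conservative.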

StubbornAtom