7

I am aware of this question but my issue is about two competing ways of obtaining the 2D covariance error ellipse in two competing answers over at StackOverflow.

The first answer obtains the width and height of the ellipse as:

$$w=2\times nstd\times \sqrt{\lambda_1} \;\;;\;\; h=2\times nstd\times \sqrt{\lambda_2}$$

while the second answer says that:

$$w=2\times \sqrt{\lambda_1\times r2} \;\;;\;\; h=2\times \sqrt{\lambda_2\times r2}$$

where $(\lambda_1, \lambda_2)$ are the eigenvalues of the covariance matrix of the 2D data, $nstd$ is the standard deviation I set for the ellipse (e.g.: $nstd=2$ if I want the 2-sigma ellipse), and $r2$ is the chi-square percent point function for that $nstd$.

The first answer, in blue below, always gives a smaller ellipse (the rotation angle is equal for both answers). Which is the proper way of obtaining the covariance ellipse?

enter image description here

import numpy as np
from scipy.stats import norm, chi2
import matplotlib.pyplot as plt
from matplotlib.patches import Ellipse


def main(nstd=2.):
    """
    Generate an `nstd` sigma ellipse based on the mean and covariance of a
    point "cloud".
    """

    # Generate some random, correlated data
    cov = np.random.uniform(0., 10., (2, 2))
    points = np.random.multivariate_normal(mean=(0, 0), cov=cov, size=1000)

    # The 2x2 covariance matrix to base the ellipse on.
    cov = np.cov(points, rowvar=False)

    # Location of the center of the ellipse.
    mean_pos = points.mean(axis=0)

    # METHOD 1
    width1, height1, theta1 = cov_ellipse(points, cov, nstd)

    # METHOD 2
    width2, height2, theta2 = cov_ellipse2(points, cov, nstd)

    # Plot the raw points.
    x, y = points.T
    ax = plt.gca()
    plt.scatter(x, y, c='k', s=1, alpha=.5)
    # First ellipse
    ellipse1 = Ellipse(xy=mean_pos, width=width1, height=height1, angle=theta1,
                       edgecolor='b', fc='None', lw=2, zorder=4)
    ax.add_patch(ellipse1)
    # Second ellipse
    ellipse2 = Ellipse(xy=mean_pos, width=width2, height=height2, angle=theta2,
                       edgecolor='r', fc='None', lw=.8, zorder=4)
    ax.add_patch(ellipse2)
    plt.show()


def eigsorted(cov):
    '''
    Eigenvalues and eigenvectors of the covariance matrix.
    '''
    vals, vecs = np.linalg.eigh(cov)
    order = vals.argsort()[::-1]
    return vals[order], vecs[:, order]


def cov_ellipse(points, cov, nstd):
    """
    Source: http://stackoverflow.com/a/12321306/1391441
    """

    vals, vecs = eigsorted(cov)
    theta = np.degrees(np.arctan2(*vecs[:, 0][::-1]))

    # Width and height are "full" widths, not radius
    width, height = 2 * nstd * np.sqrt(vals)

    return width, height, theta


def cov_ellipse2(points, cov, nstd):
    """
    Source: https://stackoverflow.com/a/39749274/1391441
    """

    vals, vecs = eigsorted(cov)
    theta = np.degrees(np.arctan2(*vecs[::-1, 0]))

    # Confidence level
    q = 2 * norm.cdf(nstd) - 1
    r2 = chi2.ppf(q, 2)

    width, height = 2 * np.sqrt(vals * r2)

    return width, height, theta


if __name__ == '__main__':
    main()
Gabriel
  • 3,072
  • 1
  • 22
  • 49
  • None of your references is relevant to what people would ordinarily understand an "error ellipse" to be. One SO post links to an [outside page that is completely wrong.](http://www.visiondummy.com/2014/04/draw-error-ellipse-representing-covariance-matrix/) Given that confusion, could you explain what your understanding is? Then we could point you to a correct solution. – whuber Aug 06 '18 at 22:02
  • I understand an "error ellipse" as the N-sigma ellipse generated from the mean and covariance of a 2D point cloud. I.e., a 2-sigma error ellipse should contain about 95% of the point cloud. This is what the first answer linked above is supposed to give. The completely wrong method you mention, is the base of the second answer and I'm not sure why it is wrong either. – Gabriel Aug 07 '18 at 01:59
  • Also, what is the proper definition of "error ellipse" if that is not the one? I could not find it. – Gabriel Aug 07 '18 at 02:10
  • Many statisticians would understand "error ellipse" to be a *confidence region* for the center of the points (the arithmetic mean of the distribution from which they are drawn). That is strongly implied by the use of the term "confidence" in some of the references. Other possible meanings include some kind of ellipse (such as smallest area) that covers 95% of the points; a *tolerance region* for 95% of the underlying distribution; a *prediction region* for a future random point; and more. The formulas you quote differ because they concern two such different meanings. – whuber Aug 07 '18 at 11:40
  • So is either method incorrect, or do they just differ in what they estimate? Because the first method talks about the region that covers N-sigma or X% of the points, which would be an acceptable meaning. And the second method talks about a "confidence ellipse" which would also be acceptable. – Gabriel Aug 07 '18 at 12:54

1 Answers1

5

I believe I've found the reason for the discrepancy between these two methods. Both seem to be correct, they just estimate different statistical concepts.

The first method describes an error ellipse, characterized by some number of standard deviations.

The second method describes a confidence ellipse, characterized by some probability value.

The difference between these two is explained in this old paper. This question (its most upvoted answer actually) are also related to this issue.

Gabriel
  • 3,072
  • 1
  • 22
  • 49