Could anyone give me some practical examples of the Cauchy Distribution? What makes it so popular?

Glorfindel
Daria
  • I challenge the premise -- is it actually popular as a practical model\*? (If it is, how do you know, outside of seeing practical examples already?) ... \*[It's widely used in textbook examples because of its simplicity and as a counterexample to various things, but I doubt those count as practical. It's sometimes used as a prior, but that's not as a data model.] – Glen_b Jul 07 '19 at 09:30
  • I've seen some practical examples out of my field of studies, specifically for MCMC algorithm. That made me curious whether it can be applied in finance or ML. – Daria Jul 07 '19 at 09:32
  • When you say "for MCMC algorithm" do you mean instead "as a Bayesian prior" or do you mean "as a model for data in a Bayesian framework" or something else? – Glen_b Jul 07 '19 at 09:34
  • For computing hierarchical priors and reference priors. – Daria Jul 07 '19 at 09:40
  • Its [use as a prior](https://stats.stackexchange.com/search?q=cauchy+prior) is because of the distribution's properties (in general, the aim is to give some kind of weakly informative prior); from the wording of the question I wouldn't have thought you meant to include priors. There's a somewhat related question here: [What are the properties of a half Cauchy distribution?](https://stats.stackexchange.com/questions/237847/what-are-the-properties-of-a-half-cauchy-distribution) – Glen_b Jul 07 '19 at 09:59
  • See https://stats.stackexchange.com/a/36037/919 for an illuminating description of the Cauchy distribution. – whuber Jul 07 '19 at 13:25
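
To make the "weakly informative prior" point from the comments above a bit more concrete, here is a minimal sketch of the half-Cauchy density next to a half-normal with the same scale. The function names and the `scale` parameter are my own illustrative choices, not taken from any particular Bayesian library. The idea is only that the half-Cauchy keeps noticeable probability far from zero, which is why it is often suggested for scale parameters.

```python
import math

def half_cauchy_pdf(x, scale=1.0):
    """Density of a half-Cauchy with the given scale (support x >= 0)."""
    if x < 0:
        return 0.0
    return 2.0 / (math.pi * scale * (1.0 + (x / scale) ** 2))

def half_cauchy_sf(x, scale=1.0):
    """P(X > x) for a half-Cauchy; the CDF is (2/pi) * atan(x / scale)."""
    if x < 0:
        return 1.0
    return 1.0 - (2.0 / math.pi) * math.atan(x / scale)

def half_normal_sf(x, scale=1.0):
    """P(X > x) for a half-normal, i.e. P(|Z| > x / scale) for Z ~ N(0, 1)."""
    if x < 0:
        return 1.0
    return math.erfc(x / (scale * math.sqrt(2.0)))

# Compare how much prior mass each distribution leaves beyond k scale units.
for k in (1, 5, 25):
    print(k, half_cauchy_sf(k), half_normal_sf(k))
```

The median of the half-Cauchy equals its scale, so the scale can be read as "half of the prior mass lies below this value", while the tail still allows much larger values.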

2 Answers

The standard Cauchy distribution arises as the ratio of two independent standard normal random variables. If $X \sim N(0,1)$ and $Y \sim N(0,1)$ are independent, then $\tfrac{X}{Y} \sim \operatorname{Cauchy}(0,1)$.
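
A quick simulation can illustrate the ratio construction. This is only a sketch: the seed, the sample size, and the helper names are arbitrary choices of mine. It compares empirical quantiles of $X/Y$ with the standard Cauchy quantile function $\tan(\pi(p - 1/2))$.

```python
import math
import random

random.seed(0)   # arbitrary seed, only for reproducibility
n = 200_000      # arbitrary sample size

# Ratio of two independent standard normal draws, sorted for quantile lookup.
ratios = sorted(random.gauss(0, 1) / random.gauss(0, 1) for _ in range(n))

def empirical_quantile(sorted_values, p):
    """Crude empirical quantile: pick the order statistic at position p."""
    return sorted_values[int(p * (len(sorted_values) - 1))]

def cauchy_quantile(p):
    """Quantile function of the standard Cauchy distribution."""
    return math.tan(math.pi * (p - 0.5))

for p in (0.1, 0.25, 0.5, 0.75, 0.9):
    print(p, empirical_quantile(ratios, p), cauchy_quantile(p))
```

Quantiles are used here rather than the sample mean and variance because the Cauchy distribution has no moments, so those summaries do not settle down as the sample grows.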

The Cauchy distribution is important in physics (where it’s known as the Lorentz distribution) because it is the solution to the differential equation describing forced resonance. In spectroscopy, it describes the shape of spectral lines subject to homogeneous broadening, in which all atoms interact in the same way with the frequency range contained in the line shape.
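
As a small illustration of the line shape, the sketch below evaluates the Lorentzian (Cauchy) density and checks that the scale parameter is the half width at half maximum. The names `x0` and `gamma` and the numeric values are assumptions for the example, not quantities from a real spectrum.

```python
import math

def lorentzian(x, x0=0.0, gamma=1.0):
    """Cauchy/Lorentz density with location x0 and scale (half width) gamma."""
    return 1.0 / (math.pi * gamma * (1.0 + ((x - x0) / gamma) ** 2))

# Hypothetical line centre and half width, chosen only for illustration.
x0, gamma = 5.0, 0.3

peak = lorentzian(x0, x0, gamma)                  # maximum, 1 / (pi * gamma)
at_half_width = lorentzian(x0 + gamma, x0, gamma)  # one half width from centre

print(peak, at_half_width, at_half_width / peak)
```

The density drops to half of its peak value exactly one `gamma` away from the centre, so the full width at half maximum of the line is `2 * gamma`.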

Applications:

  • Used in mechanical and electrical theory, physical anthropology, and measurement and calibration problems.

  • In physics it is called a Lorentzian distribution, where it is the distribution of the energy of an unstable state in quantum mechanics.

  • Also used to model the points of impact on a fixed straight line of particles emitted from a point source.

Source.
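
The point-source application in the list above can be sketched with a short simulation. I am assuming a source at perpendicular distance `gamma` from the line, directly opposite the point `x0`, that emits particles at an angle uniform on $(-\pi/2, \pi/2)$; all names and numbers are illustrative. Under that assumption the impact point is `x0 + gamma * tan(theta)`, which follows a Cauchy distribution with location `x0` and scale `gamma`.

```python
import math
import random

random.seed(1)          # arbitrary seed, only for reproducibility
x0, gamma = 2.0, 1.5    # hypothetical source position and distance to the line
n = 200_000

# Each particle leaves at a uniformly distributed angle and hits the line.
hits = sorted(
    x0 + gamma * math.tan(random.uniform(-math.pi / 2, math.pi / 2))
    for _ in range(n)
)

def empirical_quantile(sorted_values, p):
    """Crude empirical quantile: pick the order statistic at position p."""
    return sorted_values[int(p * (len(sorted_values) - 1))]

# For Cauchy(x0, gamma) the quartiles are x0 - gamma, x0, and x0 + gamma.
for p, expected in ((0.25, x0 - gamma), (0.5, x0), (0.75, x0 + gamma)):
    print(p, empirical_quantile(hits, p), expected)
```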

Matthew Anderson
  • Thank you. The first sentence is pretty helpful. I am quite far from physics; could you give any examples from finance or machine learning? – Daria Jul 06 '19 at 20:51
  • It's not really used in finance or machine learning (practically); it's used in physics (99.9% of the time). I suppose that if someone wanted to model the ratio between two independent, normally distributed variables in finance, they would use the Cauchy distribution. – Matthew Anderson Jul 06 '19 at 20:53
  • A reason it could be useful in finance is that it has extremely heavy tails. It has no moments, so it doesn’t make sense to say that it has high kurtosis, but it is prone to extreme observations, both high and low. – Dave Jul 06 '19 at 21:06
  • It _is_ used in machine learning, in particular as a prior distribution in Bayesian inference. Specifically, the half-Cauchy is used as a prior for certain scale variables. – Wayne Jul 06 '19 at 21:42
  • @Wayne Could you please give an example, maybe a reference? – Dave Jul 06 '19 at 22:21
  • @Dave https://stats.stackexchange.com/questions/237847/what-are-the-properties-of-a-half-cauchy-distribution – Wayne Jul 07 '19 at 02:53
  • Thank you. It helped a lot. However, I still lack an example that shows its application rather than mathematical notation. I do get the whole point of hierarchical priors in Bayesian inference, but I am not sure I understand why the Cauchy distribution comes in handy in this case. – Daria Jul 07 '19 at 09:17
  • "ratio of two independent normal distributions" isn't exactly right. That should say "ratio of two independent normally distributed random variables." – Michael Hardy Jul 07 '19 at 23:39
  • Another place it shows up (in disguise): a t-distribution with 1 degree of freedom is a Cauchy distribution. – AlaskaRon Jul 08 '19 at 07:51
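
The remark that a t-distribution with 1 degree of freedom is a Cauchy distribution can be checked numerically from the two density formulas. The sketch below writes both densities out by hand with the standard library; the function names are mine.

```python
import math

def student_t_pdf(t, nu):
    """Density of Student's t with nu degrees of freedom."""
    coef = math.gamma((nu + 1) / 2) / (math.sqrt(nu * math.pi) * math.gamma(nu / 2))
    return coef * (1 + t * t / nu) ** (-(nu + 1) / 2)

def cauchy_pdf(t):
    """Density of the standard Cauchy distribution."""
    return 1 / (math.pi * (1 + t * t))

# With nu = 1 the two densities agree up to floating-point rounding.
for t in (-10.0, -1.0, 0.0, 0.5, 3.0, 50.0):
    print(t, student_t_pdf(t, 1), cauchy_pdf(t))
```

The agreement follows because $\Gamma(1) = 1$ and $\Gamma(1/2) = \sqrt{\pi}$, so the normalising constant of the t density with one degree of freedom reduces to $1/\pi$.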

In addition to its usefulness in physics, the Cauchy distribution is commonly used in models in finance to represent deviations in returns from the predictive model. The reason for this is that practitioners in finance are wary of using models that have light-tailed distributions (e.g., the normal distribution) on their returns, and they generally prefer to go the other way and use a distribution with very heavy tails (e.g., the Cauchy). The history of finance is littered with catastrophic predictions based on models that did not have heavy enough tails in their distributions. The Cauchy distribution has sufficiently heavy tails that its moments do not exist, and so it is an ideal candidate to give an error term with extremely heavy tails.
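
To give a rough sense of how different the tails are, here is a small sketch comparing the two-sided tail probability $P(|X| > k)$ for a standard normal and a standard Cauchy. It uses closed-form expressions only; the function names are mine and the standard (location 0, scale 1) versions are assumed purely for comparison, not as a model of actual returns.

```python
import math

def normal_two_sided_tail(k):
    """P(|Z| > k) for Z ~ N(0, 1)."""
    return math.erfc(k / math.sqrt(2))

def cauchy_two_sided_tail(k):
    """P(|X| > k) for X ~ Cauchy(0, 1); the CDF is 1/2 + atan(x)/pi."""
    return 1 - (2 / math.pi) * math.atan(k)

for k in (1, 3, 5, 10):
    print(k, normal_two_sided_tail(k), cauchy_two_sided_tail(k))
```

Under the normal, an observation more than 5 scale units from the centre is vanishingly rare, whereas under the Cauchy it still happens with probability above 10%, which is the sense in which the Cauchy is an extreme choice of heavy-tailed error term.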

Note that the fatness of tails in the error terms of finance models was one of the main themes of the popular critique by Taleb (2007). In that book, Taleb points out instances where financial models have used the normal distribution for error terms, and he notes that this underestimates the true probability of extreme events, which are particularly important in finance. (In my view this book gives an exaggerated critique, since models using heavy-tailed deviations are in fact quite common in finance. In any case, the popularity of this book shows the importance of the issue.)

Ben
  • Thank you, I highly appreciate your answer as I am familiar with the book. By the way, I am not sure I understand this part of your sentence correctly: "fatness of tails in error terms". Would you mind being more precise with that? – Daria Jul 08 '19 at 14:39
  • https://en.wikipedia.org/wiki/Fat-tailed_distribution#Fat_tails_and_risk_estimate_distortions – 0xFEE1DEAD Jul 08 '19 at 16:20
  • In this kind of general discussion, we do not have a specific tail property in mind, so precision in specifying the meaning of "fatness" or "heaviness" of the tails detracts from the generality. It is worth reviewing some characterisations of [fat-tailed distributions](https://en.wikipedia.org/wiki/Fat-tailed_distribution#Fat_tails_and_risk_estimate_distortions) and [heavy-tailed distributions](https://en.wikipedia.org/wiki/Heavy-tailed_distribution) to see the kind of properties I have in mind. – Ben Jul 08 '19 at 22:02
  • Could you explain what precision means in plain English? I do get that it’s the inverse of the variance, but I would like to understand why, when we talk about priors, we get n0 (the prior sample size) in the denominator. – Daria Jul 08 '19 at 22:41
  • Without seeing the context of what you're talking about, what you ask is unclear. May I suggest that you pose this as a new question on this site, with all the relevant context given. – Ben Jul 08 '19 at 23:36