43

I use mostly "Gaussian distribution" in my book, but someone just suggested I switch to "normal distribution". Any consensus on which term to use for beginners?

Of course the two terms are synonyms, so this is not a question about substance, but purely a matter of which term is more commonly used. And of course I use both terms. But which should be used mostly?

Harvey Motulsky
  • Is there a preview section/sample chapter of your book available somewhere? I hear good things about it. – Glen_b Sep 09 '14 at 06:53
  • @Glen_b The "Look inside" feature of amazon.com lets you preview the book. Also, three chapters are available here: http://www.intuitivebiostatistics.com/excerpts/ – Harvey Motulsky Sep 09 '14 at 11:24
  • The issue of "which term **is** more commonly used" can easily be addressed, albeit crudely: A Google search of "Gaussian" distribution has about 2/3 of the hits of a search for "normal distribution." The ratio is a little different on Google Scholar, where now "Gaussian distribution" has half the hits of "normal distribution" (but only a quarter when "inverse" is excluded). These results suggest (1) "normal" is more popular but (2) "Gaussian" is widely recognized. Looking at the results suggests that phrases like "asymptotically normal" may take a long time to be replaced, if ever. – whuber Sep 09 '14 at 14:17
  • In extension of @whuber, I think there's also a difference between fields: "Gaussian" seems relatively more predominant in science, whereas "Normal" seems to be the *normal* term in social science ... – abaumann Sep 10 '14 at 20:25
  • Try "abnormal" :P – user541686 Sep 11 '14 at 09:03
  • In my experience beginners often confuse *normal* and *uniform*. Since I quite often have to ask if somebody wants a Gaussian or uniform distribution when they say normal, I prefer the unambiguous Gaussian. – CodesInChaos Sep 11 '14 at 10:08

9 Answers

48

Even though I tend to say 'normal' more often (since that's what I was taught when first learning), I think "Gaussian" is a better choice, as long as students/readers are quite familiar with both terms:

  • The normal isn't particularly typical, so the name is itself misleading. It certainly plays an important role (not least because of the CLT), but observed data is much less often particularly near Gaussian than is sometimes suggested.

  • The word (and associated words like "normalize") has several meanings that can be relevant in statistics (consider "orthonormal basis" for example). If someone says "I normalized my sample" I can't tell for sure if they transformed to normality, computed z-scores, scaled the vector to unit length, to length $\sqrt{n}$, or a number of other possibilities. If we tended to call the distribution "Gaussian" at least the first option is eliminated and something more descriptive replaces it.

  • Gauss at least has a reasonable degree of claim to the distribution.
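To make the ambiguity in the second bullet concrete, here is a minimal sketch of four different operations that could each be described as "normalizing" a sample. The function names and the sample values are hypothetical, chosen only for illustration, and the rank-based transform uses the plotting position (rank − 0.5)/n, which is one convention among several.

```python
import math
import statistics
from statistics import NormalDist


def z_scores(xs):
    """Center on the sample mean and divide by the sample standard deviation."""
    mean = statistics.fmean(xs)
    sd = statistics.stdev(xs)  # uses the n - 1 denominator
    return [(x - mean) / sd for x in xs]


def scale_to_length(xs, length=1.0):
    """Rescale the vector so that its Euclidean norm equals `length`."""
    norm = math.sqrt(sum(x * x for x in xs))
    return [x * length / norm for x in xs]


def rank_to_normal_scores(xs):
    """Replace each value by a standard normal quantile based on its rank.

    Assumes no ties; tied values would be ranked in input order.
    """
    n = len(xs)
    order = sorted(range(n), key=lambda i: xs[i])
    scores = [0.0] * n
    for rank, i in enumerate(order, start=1):
        scores[i] = NormalDist().inv_cdf((rank - 0.5) / n)
    return scores


sample = [2.0, 3.0, 4.0, 5.0, 10.0]  # made-up data, not from any source
n = len(sample)

standardized = z_scores(sample)                         # "computed z-scores"
unit_length = scale_to_length(sample)                   # "scaled to unit length"
root_n_length = scale_to_length(sample, math.sqrt(n))   # "scaled to length sqrt(n)"
gaussianized = rank_to_normal_scores(sample)            # "transformed to normality"

for name, values in [
    ("z-scores", standardized),
    ("unit length", unit_length),
    ("length sqrt(n)", root_n_length),
    ("normal scores", gaussianized),
]:
    print(name, [round(v, 3) for v in values])
```

All four results differ, which is why "I normalized my sample" on its own does not say which one was done.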

Glen_b
  • +1 for the bit "as long as students/readers are quite familiar with both terms". I think it'd be a disservice to students to teach *only* "Gaussian", just because "normal" is so widespread. – Patrick Coulombe Sep 09 '14 at 05:15
  • I agree that we have to teach both. If we were starting from scratch, and knew what we know now, we would never allow "normal" to emerge, because (1) the term is overloaded anyway and (2) the normal (Gaussian) is not normal (usual or expected) for data. "Gaussian" is the most common alternative, even though there is a history before Gauss. E.T. Jaynes suggested "central", which is a droll idea, but it hasn't caught on. (I recognise the arguments that are based on the central limit theorem.) – Nick Cox Sep 09 '14 at 08:07
  • As regards bullet #2, when it comes to broader science and mathematics as a whole, it's not necessarily clear whether "[normal](http://en.wikipedia.org/wiki/Normal)" or "[Gauss](http://en.wikipedia.org/wiki/List_of_things_named_after_Carl_Friedrich_Gauss)" is more common. ;-) – cardinal Sep 11 '14 at 00:45
  • @cardinal - I quite agree with the suggestion that it tends to lean much more toward "Gaussian" in those areas - and I would add engineering as well. – Glen_b Sep 11 '14 at 00:48
  • @Glen_b: Agreed. (In my mental model, I include engineering under the general umbrella of science, though that is, perhaps, somewhat outside the, ahem, norm.) :-) – cardinal Sep 11 '14 at 00:58
  • @cardinal Don't let Dr. Sheldon Cooper hear you say that. – Code-Guru Sep 11 '14 at 22:12
  • So, for your second bullet point, would you suggest *Gaussianize*? – ruakh Sep 12 '14 at 07:08
  • @ruakh Well, it would be easily understood, I think, though there are other possibilities. – Glen_b Sep 12 '14 at 07:29
  • @Glen_b: Sorry, I guess I phrased that as a yes-or-no question, but what I really meant to ask was: "So, for your second bullet point, what would you suggest instead of *normalize*?" – ruakh Sep 13 '14 at 02:33
  • I don't necessarily suggest one, but if I had to pick one then Gaussianize would be a pretty obvious choice. The reason I didn't say "we'd just use Gaussianize" is that statistics seems full of odd choices of terminology, so even if we did all call it the Gaussian distribution, we'd probably settle on something else instead of "Gaussianize" ... and I'd shrug and go with whatever that was. – Glen_b Sep 13 '14 at 02:58
36

I would use Gaussian.

One problem that faces people learning statistics is that we use everyday English words to mean different things (power, significant, distribution etc). To the extent that we can minimize this, we should. "Normal" already has a bunch of meanings.

Peter Flom
  • Peter: I agree. That is why I have always used "Gaussian". But a comment from a reviewer on the new (concise) edition strongly pushed "normal". – Harvey Motulsky Sep 09 '14 at 00:58
25

One argument in favor of normal is the entrenched $N(\mu, \sigma^2)$ notation for the distribution, in which $N$ stands for "normal". I haven't seen anyone propose changing this to $G(\mu, \sigma^2)$.

Nate Eldredge
  • $G$ would probably conflict with Gamma as well, which should be denoted $\Gamma$ but unfortunately that's taken by the function of the same name. An alternative might be $Gauss$ or $Gaussian$, which would also be consistent with $Bernoulli$ and the frequent abbreviation of $binomial$ to $binom$. But I actually like the $N$ notation, because I write it constantly and it's an easy letter to scribble. – shadowtalker Sep 09 '14 at 04:29
  • That's a fair point, though if both terms are presented, the use of $N$ can be introduced then. – Glen_b Sep 09 '14 at 10:17
  • Let $G \sim {\cal N}(\mu, \sigma^2)$ ;-) – Stéphane Laurent Sep 09 '14 at 13:10
  • @StéphaneLaurent: I guess my point is that if you avoid the word "normal", students may have a hard time remembering what $N(\mu,\sigma^2)$ means, since it would no longer be mnemonic. – Nate Eldredge Sep 09 '14 at 14:38
10

In German it is often called Gaußsche Normalverteilung ("Gaussian normal distribution"), which combines both names and leaves little room for confusion.

Would it be appropriate for you to combine "Gaussian" and "normal"?

gismo141
9

According to the Wolfram encyclopedia:

While statisticians and mathematicians uniformly use the term "normal distribution" for this distribution, physicists sometimes call it a Gaussian distribution and, because of its curved flaring shape, social scientists refer to it as the "bell curve."

I agree that "normal" is easier to confuse - yet I suspect statistics books usually use "normal".

amoeba
Gerenuk
  • +1 for an answer that's descriptive rather than prescriptive. I actually agree with the other answers that Gaussian is preferable, no matter what field, but it's informative to start from the context of what's widespread in existing usage. – R.. GitHub STOP HELPING ICE Sep 10 '14 at 03:43
  • As for the phrase "bell curve", I would avoid it **entirely** in any teaching setting. It has highly racist overtones as a result of the infamous book by the same name, and any of your students who are aware of it are likely to be distracted by it and associate whatever you're saying with nonsensical theories about racial superiority rather than having the subject stand on its own. – R.. GitHub STOP HELPING ICE Sep 10 '14 at 03:46
  • @R.. Descriptive, yes, but that description is directly contradicted by the answers here, which indicate that a significant fraction of statisticians and mathematicians actually use the term "Gaussian". – David Richerby Sep 11 '14 at 09:43
  • Another reason for *not* using the term "bell curve" to denote the (density function of the) Gaussian/normal distribution is that there are *many distributions* whose probability density function (pdf) resembles a bell curve. Even the pdf of a Cauchy distribution looks like a bell curve! – Mico Sep 14 '14 at 15:03
  • +1 for explaining relative terms in different disciplines. Thanks! – StatguyUser Jun 12 '16 at 17:17
7

I'd like to point out that S. Stigler uses Normal / Gauss / Laplace-Gauss distribution to prove 'Stigler's law of eponymy' published in Statistics on the Table (some pages are available on books.google).

Particularly interesting and relevant to this question is that on pp. 287-288 there are tables of the historical usage of 'Normal' vs 'Gauss' vs 'Laplace', and it seems that over the years the usage shifted from 2:15 in favor of normal in 1816-1884 to 8:14 (1888-1917) to 5:17 (1919-1939) to 9:10 (1947-1976).

So according to this, the usage of 'normal' vs 'Gauss' is becoming more balanced. Or, if you believe that the trend will continue, 'Gauss' will beat 'normal' in 50-100 years.

gung - Reinstate Monica
pes
5

An answer I haven't seen yet among all the good answers:

I mostly use "normal" for reasons of previous familiarity, but I like to capitalize it to emphasize its technical meaning: "... if the data are Normally distributed ..." (I don't know whether I copied this practice from somewhere else or (re-)invented it myself)

Ben Bolker
5

Which to use depends on the level of statistics being taught. Unfortunately, my teaching experience indicates that the majority of undergraduate students never fully grasp the concept of a probability distribution. However, they all must somehow come to grips with the CLT and ways to think about uncertainty. For an undergraduate class, Normal is preferable because it doesn't add the anxiety of a new unfamiliar word. For graduate students, Gaussian is preferred because of all the above-mentioned confusion over normalization and the historical context that it provides. I teach an undergraduate research class requiring two prerequisite statistics classes, and all the undergraduate books that I have seen used over the last 30 years have used Normal.

TJ Olney
  • "the majority of undergraduate students never fully grasp the concept of a probability distribution" +1 – Code-Guru Sep 11 '14 at 22:16
4

The name "normal" came from the observation that errors behave normally. You will find more details here. If that is the reason for calling this distribution "normal", it may create new confusion, since the usual ("normal") distribution for counts of accidents is Poisson. I believe we should move forward and start calling it Gaussian instead.