28

When I first started learning statistics, procedures like the t-test, ANOVA, chi-squared and linear regression each appeared to be very different creatures. But now I realise these procedures each do more or less the same thing. And likewise, values such as the variance, residuals, standard error and mean also measure more or less the same thing.

So I reckon all of these procedures and values, and indeed all of statistics, can be described in just one simple sentence:

What is the expected value and what is the variation around this value?

The word expected could be replaced by any of these words: hypothesised, predicted, or central.
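To make the question concrete, here is a minimal Python sketch of how the mean acts as the expected value while the residuals, variance and standard error all describe variation around it. The sample is made up (an assumption, not data from anywhere).

```python
import math
import statistics

# Hypothetical sample, invented purely for illustration.
sample = [4.0, 5.0, 6.0, 7.0, 8.0]

n = len(sample)
mean = statistics.fmean(sample)                       # the "expected" (central) value
residuals = [x - mean for x in sample]                # each point's deviation from it
variance = sum(r * r for r in residuals) / (n - 1)    # sample variance: averaged squared residuals
std_error = math.sqrt(variance / n)                   # how uncertain the mean itself is

print("mean:", mean)
print("residuals:", residuals)
print("variance:", variance)
print("standard error:", std_error)
```

The residuals always sum to zero around the mean, which is one way of seeing that the variance and standard error are built from the same deviations.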

How would other people describe statistics in one sentence?

amoeba
luciano
  • 11
    @Trynna This description is *far* too narrow: it characterizes only point estimation. It is like describing mathematics as adding and multiplying numbers--which very well might be the perspective of someone who has studied arithmetic for a few years in school--but falls far short of what the field comprises. – whuber Mar 05 '15 at 22:39

18 Answers

27

Statistics provides the reasoning and methods for producing and understanding data.

American Statistical Association

whuber
  • 1
    +1 I was trying to come up with an expression of something very close to this notion. I'd have added something about coming to conclusions on the basis of data, but it's not quite so succinct. – Glen_b Mar 05 '15 at 22:40
  • 4
    @Glen You can tell that a lot of thought was put into this characterization. I like having it here somewhere on our site. That, and a similarly pithy description of machine learning, ought to belong in our help pages. – whuber Mar 05 '15 at 22:41
  • Well, quantitative data. – rolando2 Mar 05 '15 at 23:44
  • 2
    I am not sure I agree with the quote (though it is a lovely aspiration). As an epidemiologist, I know things about study design, the production of data, and causal inference around the same that are outside the ken of many of the fine statisticians around me. Indeed, the fancy causal inference for recursive causal graphs originated in three fields not named statistics (epidemiology, computer science, and sociology, as I understand it). I raise this not in a bellicose spirit, but because the quoted sentence describes much of *science* and doesn't nail down stats *per se*. – Alexis Mar 06 '15 at 00:07
  • @alexis When a psychologist works on psychometry, an economist on econometrics or a scientist on experimental design, aren't they all "doing statistics"? They may not see themselves as statisticians (e.g. if they see stats as a tool to help in their substantive study goal, not the focus of their work) even though they are more familiar with the tools they use regularly than most professional statisticians. Much of scientific thinking, especially separating signal and noise in observed data, is statistical in nature. – Silverfish Mar 06 '15 at 08:27
  • 3
    The ASA description is much more about statistics as a domain of human knowledge and activity, not marking out who a "statistician" might be. Until WW2 professional statisticians were a rarity, but that doesn't mean stats wasn't being applied in commercial and academic settings. I don't think a good definition of statistics could be limited to what professional statisticians do. – Silverfish Mar 06 '15 at 08:37
  • @Silverfish A very fair point. But the counterfactual causal calculus is more than simply statistics, and entails a formalism of both what causality is and what causal inference is. I see that as beyond statistics, although statistics plays a critical and major role in causal estimation. My broader point remains: whuber's quote seems more about science (at least the parts of science concerned with testing truth values, as opposed to, say, mapping and translating meaning, or situating facts within wider bodies of knowledge). – Alexis Mar 06 '15 at 17:06
  • 2
    @Alexis Perhaps there's some difficulty with the *level of understanding* implied by the word "understanding", which the ASA definition leaves rather ambiguous in its brevity. A wider interpretation might be over-encompassing. Certainly if we include substantive physical or social interpretation and underlying mechanisms as part of "understanding", then it goes beyond "mere" statistics. On the other hand, it's not clear to me why inference from data, causal or otherwise, can't lie within the domains of both scientific and statistical endeavour. – Silverfish Mar 06 '15 at 17:59
  • I like the cut of your jib, @Silverfish! – Alexis Mar 06 '15 at 18:52
13

Statistics is fundamentally concerned with the understanding of structure in data.

Bill Venables and Brian Ripley, first sentence in Chapter 1 of Modern Applied Statistics with S

mark999
  • 3
    This is an interesting take on statistics, albeit a limited one. The possible ambiguities are revealing: a computer scientist would understand "structure in data" in a non-statistical way. (Venables and Ripley work at the intersection of statistics and computing.) – whuber Mar 06 '15 at 23:15
  • @whuber I agree with you. There's nothing to suggest that V&R intended it to be a one-sentence description of all of statistics, but ever since I first read it, I've thought it was a nice description. I interpret "structure in data" as "characteristics of the population from which the sample was taken". – mark999 Mar 07 '15 at 01:32
11

Statistics provides the reasoning and methods for converting data to meaningful information.

IrishStat
9

In the words of the late Leo Breiman:

The goals in statistics are to use data to predict and to get information about the underlying data mechanism.

http://projecteuclid.org/euclid.ss/1009213726

Richard Border
6

Personally, I like the following quote from Stephen Senn in Dicing with Death: Chance, Risk and Health (Cambridge University Press, 2003). I highlighted one sentence (or two) that, I believe, summarizes his main point, although the whole paragraph is worth reading.

Statistics are and statistics is.
Statistics, singular, contrary to the popular perception, is not really about facts; it is about how we know, or suspect, or believe, that something is a fact. Because knowing about things involves counting and measuring them, then, it is true, that statistics plural are part of the concern of statistics singular, which is the science of quantitative reasoning. This science has much more in common with philosophy (in particular epistemology) than it does with accounting. Statisticians are applied philosophers. Philosophers argue how many angels can dance on the head of a needle; statisticians count them. Or rather, count how many can probably dance. Probability is the heart of the matter, the heart of all matter if the quantum physicists can be believed. As far as the statistician is concerned this is true, whether the world is strictly deterministic as Einstein believed or whether there is a residual ineluctable indeterminacy. We can predict nothing with certainty but we can predict how uncertain our predictions will be, on average that is. Statistics is the science that tells us how.

chl
5

Statistics is the science of learning from data and measuring, controlling, and communicating uncertainty.

Marie Davidian & Thomas Louis

They continue:

; and it thereby provides the navigation essential for controlling the course of scientific and societal advances

whuber
Momo
  • I like this definition because it singles out the "uncertainty" aspect. The second part is nice because it says that statistics does not exist only by itself, but has to be seen in a broader context. To be completely satisfied however, I would perhaps merge that with the ASA one to: – Momo Mar 06 '15 at 17:45
  • 1
    Statistics as the science of learning from data and measuring, controlling, and communicating uncertainty provides the reasoning and methods for producing and understanding data. – Momo Mar 06 '15 at 17:46
  • Could you provide a citation? Your link is broken. – whuber Apr 14 '20 at 16:03
2

Statistics is a kitbag of methods and modes of thought that help people to make clear conclusions from noisy information.

Michael Lew
2

Because we are not godlike, all-knowing creatures, we have to deal with uncertainty, and statistics provides methods to incorporate and reflect that uncertainty.

elevendollar
2

Statistics is a sub-field of philosophy that deals with the question "how do we learn from observations?" using rigorous mathematical concepts.

Just a side note: you can make "one sentence" very long. There is a book by B. Hrabal that consists of one long sentence; see Dancing Lessons for the Advanced in Age.

pes
2

Statistics is both the science of uncertainty and the technology of extracting information from data

David J. Hand

Momo
2

Statistics is a set of logical principles and mathematical methods for summarizing quantified information in accurate, relevant ways.

SQLServerSteve
1

In my own words

Statistics is the science of what might be

This is sort of tongue-in-cheek.

Glen_b
  • 1
    If you were to mask the first word and ask people to fill in the blank, I suspect "statistics" would not be the first thing they come up with--and perhaps not the second or third, either. "Futurology," "speculation," "science fiction," and maybe--getting a little closer to your intent--"prediction" and "forecasting"--would likely be popular choices. Even "oneirology" and "apotropaism" would be possibilities. :-) – whuber Mar 06 '15 at 23:29
1

Fisher (1922) gave his view on the essence of statistics in the following quote (bold font added by me for the one sentence requirement):

In order to arrive at a distinct formulation of statistical problems, it is necessary to define the task which the statistician sets himself: briefly, and in its most concrete form, the object of statistical methods is the reduction of data. A quantity of data, which usually by its mere bulk is incapable of entering the mind, is to be replaced by relatively few quantities which shall adequately represent the whole, or which, in other words, shall contain as much as possible, ideally the whole, of the relevant information contained in the original data.
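One way to illustrate Fisher's "reduction of data" is with a sufficient statistic. The sketch below is only an illustration under an assumed normal model with invented data, and the function names are hypothetical. It checks that the normal log-likelihood computed from every observation matches the one computed from just three summaries: the count, the sum and the sum of squares.

```python
import math

def normal_loglik_full(data, mu, sigma):
    """Gaussian log-likelihood computed from every single observation."""
    return sum(
        -0.5 * math.log(2 * math.pi * sigma**2) - (x - mu) ** 2 / (2 * sigma**2)
        for x in data
    )

def normal_loglik_reduced(n, total, total_sq, mu, sigma):
    """The same log-likelihood computed from three summary quantities only."""
    # Expand sum((x - mu)^2) = sum(x^2) - 2*mu*sum(x) + n*mu^2
    sq_dev = total_sq - 2 * mu * total + n * mu**2
    return -0.5 * n * math.log(2 * math.pi * sigma**2) - sq_dev / (2 * sigma**2)

# Hypothetical data, invented purely for illustration.
data = [2.1, 3.4, 1.9, 4.2, 3.3, 2.8]

# The "reduction": six numbers replaced by three.
n = len(data)
total = sum(data)
total_sq = sum(x * x for x in data)

# Compare the two computations at a few arbitrary parameter values.
agreement = []
for mu, sigma in [(0.0, 1.0), (3.0, 0.5), (2.5, 2.0)]:
    full = normal_loglik_full(data, mu, sigma)
    reduced = normal_loglik_reduced(n, total, total_sq, mu, sigma)
    agreement.append(math.isclose(full, reduced, rel_tol=1e-9, abs_tol=1e-9))

print(agreement)
```

Under this assumed model, nothing about the parameters is lost by discarding the raw data once the three summaries are kept, which is the sense of "ideally the whole of the relevant information" in the quote.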

ekvall
0

A results-oriented (and so not really descriptive) one-liner would be, for me,

Statistics is what makes the human world go round, irrespective of what it is that does the same for Nature.

Alecos Papadopoulos
  • 4
    Are you confusing statistics with politics? Or maybe with love? – whuber Mar 07 '15 at 18:35
  • @whuber (+1) No. Both make most of their decisions based on Statistics, whether they realize it or not. – Alecos Papadopoulos Mar 07 '15 at 18:39
  • 3
    I can see it now, in an upcoming movie, when the male lead gets on his knees to propose: "Baby, you're my UMVUE, will you marry me?" :-) (Let's use a shrinkage estimator and bring our coefficients together...) – whuber Mar 07 '15 at 18:41
  • @whuber (+2) ...this is the "don't realize it" part: this is exactly what the male lead _means_, even though he does not use the language! (I concede that I may be guilty of philosophical imperialism here). – Alecos Papadopoulos Mar 07 '15 at 18:44
  • 2
    Your deeply respectable cultural background (insofar as your name and location allow one to infer it), which one can trace back at least to the early Sophists, allows you quite a bit of latitude in that regard. :-) – whuber Mar 07 '15 at 18:46
  • @whuber (+3) ...and it would appear that I am shamelessly exploiting it. Sophists, ah... aren't they the null hypothesis hardest to reject? – Alecos Papadopoulos Mar 07 '15 at 18:58
0

Statistics is a tool for modeling the generation of data by uncertain and/or probabilistic processes.

thecity2
0

A name for functions of the results of observations.

from here.

This is the meaning closest to that of the OP. He's not talking about the branch of science. He means statistics such as the mean and median. These are functions of the data.

Aksakal
  • I beg to disagree: in English, the phrase "all of statistics" clearly means the *discipline* of statistics, not some collection of measurable functions or procedures. Moreover, the OP has had years now to dispute the uniformly consistent interpretation of the many answers in this thread, but has not (and was active during much of that period). – whuber Jun 22 '20 at 18:01
  • @whuber "all of these procedures and values, and indeed all of statistics" - OP just lumped "all" with what he started with as "procedures and values." – Aksakal Jun 22 '20 at 18:10
  • Yes, they did: but in light of the history of this thread, your interpretation of that is not terribly interesting or useful. – whuber Jun 22 '20 at 18:13
-1

Statistics is about torturing data long enough until it confesses to anything you want to show.

I am paraphrasing Ronald Coase, see link

Vladislavs Dovgalecs
  • -1, was this intended as tongue in cheek? – gung - Reinstate Monica Mar 05 '15 at 21:58
  • @gung yes and no, I was quoting Ronald Coase. – Vladislavs Dovgalecs Mar 05 '15 at 21:59
  • 4
    Based on the version [here](http://stats.stackexchange.com/a/2044/7290), it is at best a bad paraphrase. That isn't a good 1-sentence summary of what statistics is. – gung - Reinstate Monica Mar 05 '15 at 22:19
  • 3
    @gung well, the OP asked how different people would describe it. It will always be his or her point of view or opinion. It will be different for different people. OP tried to gather different opinions IMHO. – Vladislavs Dovgalecs Mar 05 '15 at 22:46
  • 2
    xeon, it would be a great kindness to Coase to edit your answer to properly cite and source the attribution. – Alexis Mar 06 '15 at 00:09
  • [As you know,](http://stats.stackexchange.com/help/dont-ask) xeon, we would close any thread that attempted to "gather different opinions." Much more has to be read into the question, including the implicit understanding that answers be authoritative, objective, and useful. Although this one is authoritative, many will view it as being a useless caricature (and some have voiced those assessments in comments). – whuber Mar 06 '15 at 23:23
  • I don't think references to torture (even indirect, allusive or metaphorical references) are ever a good basis for statements of this kind. Otherwise put, this is intended to be witty, and I get the joke too, but it's too black as well. – Nick Cox Nov 26 '15 at 14:50
  • @NickCox All respect but this is only your opinion. – Vladislavs Dovgalecs Nov 26 '15 at 16:47
  • And conversely. If you want to say that torture is amusing, I withdraw from discussion. – Nick Cox Nov 26 '15 at 16:51
  • @NickCox I said nothing. I only cited a well known person. As many people, as many opinions. There are things that don't please me as well but as you said - "that is an opinion conversely". This is not the place for opinions. – Vladislavs Dovgalecs Nov 26 '15 at 16:58
  • It's a tacit opinion of yours that the quotation is suitable content for the forum. Whether someone else said it is immaterial; that's what you're implying. I am not down-voting the answer, nor am I flagging it for moderators, but I think I am within forum guidelines in dissenting from your opinion as a matter of taste. No more, but no less. – Nick Cox Nov 26 '15 at 17:03
  • @NickCox I fully agree with you here. Thanks for monitoring the forum, I appreciate that. I learned a lot from the posts here. – Vladislavs Dovgalecs Nov 26 '15 at 17:04
-2

Statistics is the mathematical science that allows you to figure out whether the differences between sets of observations are just random or not.
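A minimal sketch of that idea, assuming two small invented samples, is a permutation test: shuffle the group labels many times and count how often a difference in means at least as large as the observed one arises by chance alone. The function name and its parameters are hypothetical, and this is only one of several ways to approach the problem.

```python
import random
import statistics

def permutation_p_value(a, b, n_permutations=5000, seed=0):
    """Estimate how often random relabelling gives a mean difference
    at least as large as the one actually observed between a and b."""
    rng = random.Random(seed)            # fixed seed so the sketch is repeatable
    observed = abs(statistics.fmean(a) - statistics.fmean(b))
    pooled = list(a) + list(b)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)              # randomly reassign observations to groups
        new_a = pooled[:len(a)]
        new_b = pooled[len(a):]
        diff = abs(statistics.fmean(new_a) - statistics.fmean(new_b))
        if diff >= observed:
            extreme += 1
    # Add one to numerator and denominator so the estimate is never exactly zero.
    return (extreme + 1) / (n_permutations + 1)

# Hypothetical measurements, invented purely for illustration.
group_a = [1.0, 1.2, 0.9, 1.1, 1.0, 0.8]
group_b = [5.0, 5.2, 4.9, 5.1, 5.0, 4.8]

p_different = permutation_p_value(group_a, group_b)
p_identical = permutation_p_value(group_a, group_a)

print("clearly separated groups:", p_different)
print("identical groups:", p_identical)
```

A small estimated p-value suggests the observed difference would rarely arise from random relabelling; a large one suggests it is compatible with chance.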

Sympa
  • 1
    Describes a narrow subset of what the field is. – rolando2 Mar 05 '15 at 23:43
  • I see it differently. Ultimately, whether you are conducting hypothesis testing, regression modeling, or any other estimation, you almost always measure whether the difference between your estimate and a naïve model, or the difference between observations, is statistically significant. My sentence captures the essence of statistical significance vs. randomness. If others agree, can you give me some upvotes, so that my comment, which is easily justifiable, is not treated as a plainly wrong answer just because of one individual's subjective interpretation of narrowness? – Sympa Mar 06 '15 at 00:03
  • 2
    Please consider these types of questions that one often seeks to answer using statistics: What is the shape of this distribution? What is the nature of the relationship between these 2 variables? How can these many variables be grouped so that we can see the common issues/themes/topics/dimensions? How can these many cases be grouped so that we can see the common types/profiles? What is the best way to describe this web of relationships with an eye toward causality? What captures the trend of this variable over time? What is the best way to forecast future values? – rolando2 Mar 06 '15 at 00:11
  • In each of those cases, the answer to those questions has a strong element of statistical significance and whether what you are looking at in any shape or form is different vs. what could occur by sheer randomness. To most of us a negative vote means an explicitly wrong answer. I don't see how my answer could be categorized as such. – Sympa Mar 06 '15 at 21:42
  • 1
    The hover text over the downvote arrow states "this answer is not useful." I find it interesting--and therefore not unuseful--because it is thought-provoking, but I have not upvoted it for several reasons. The first is the assertion that stats is a "mathematical science": that comes uncomfortably close to the misconception (especially among certain mathematicians) that stats is *just* a branch of mathematics. The second is that it seems only to characterize two-sample hypothesis testing, which is a very narrow (albeit pervasive) part of statistics. – whuber Mar 06 '15 at 23:19
  • An observer visiting this site would derive that my answer is the worst one... even worse than someone who made a joke about the question. The very best answers invariably just state statistics are just methods to better understand data. However, the latter do not tell you much if anything about the field of statistics. That's especially true if you were to address such answers to outsiders. On the other hand, my answer states something precise and descriptive about the field that is readily comprehensible by outsiders. That should count for something. – Sympa Mar 08 '15 at 00:16