56
  • Statement One (S1): "One in 80 deaths is caused by a car accident."
  • Statement Two (S2): "One in 80 people dies as a result of a car accident."

Now, I personally don't see very much difference at all between these two statements. When writing, I would consider them interchangeable to a lay audience. However, I've been challenged on this by two people now, and am looking for some additional perspective.

My default interpretation of S2 is, "Of 80 people drawn uniformly at random from the population of humans, we would expect one of them to die as a result of a car accident", and I do consider this qualified statement equivalent to S1.

My questions are as follows:

  • Q1) Is my default interpretation indeed equivalent to Statement One?

  • Q2) Is it unusual or reckless for this to be my default interpretation?

  • Q3) If you do think S1 and S2 different, such that to state the second when one means the first is misleading/incorrect, could you please provide a fully-qualified revision of S2 that is equivalent?

Let's put aside the obvious quibble that S1 does not specifically refer to human deaths and assume that that is understood in context. Let us also put aside any discussion of the veracity of the claim itself: it is meant to be illustrative.

As best I can tell, the disagreements I've heard so far seem to center around defaulting to different interpretations of the first and second statement.

For the first, my challengers seem to interpret it as 1/80 * num_deaths = number of deaths caused by car accidents, but for some reason, default to a different interpretation of the second along the lines of, "if you have any set of 80 people, one of them will die in a car accident" (which is obviously not an equivalent claim). I would think that given their interpretation of S1, their default for S2 would be to read it as (1/80 * num_dead_people = number of people who died in a car accident == number of deaths caused by car accident). I'm not sure why there is a discrepancy in interpretation (their default for S2 is a much stronger assumption), or whether they have some innate statistical sense that I'm in fact lacking.

Sextus Empiricus
faulty_ram_sticks
  • 19
    "if you have any set of 80 people, one of them will die in a car accident" -- if that's how they understand that statement, then I've got an extremely old joke about 1 in 3 children being Chinese that they're absolutely going to *love*. – Steve Jessop Jan 22 '19 at 16:55
  • 3
    This is very much similar to [the difference between prevalence and incidence](https://en.m.wikipedia.org/wiki/Incidence_(epidemiology)#Incidence_vs._prevalence). As others noted "is caused" relates to some finished state, and "dies" relates to the present or future. – Sextus Empiricus Jan 22 '19 at 18:32
  • And since we’re splitting hairs, you could have a heart attack while **in** a car accident that injures no one. – WGroleau Jan 22 '19 at 18:33
  • First statement is a statistic. 2nd statement is a news headline of something that occurred. – CrossRoads Jan 22 '19 at 20:17
  • 5
    S2 does not make it clear that the other 79 died. Some or all of the other 79 could be alive. S1 says "1 out of 80 deaths" which makes it clear that all 80 in the group died. – Michael R. Chernick Jan 23 '19 at 06:30
  • 5
    As a lay person, without knowing context, I would interpret S1 as: "out of all causes of death for population X, 1/80 is due to car accidents", whereas S2 reads to me: "out of all people who are involved in car accidents, 1 in 80 dies due to it". – Gnudiff Jan 23 '19 at 08:03
  • 1
    Some people will die 80 years from now, when the statistics will probably be very different. – RemcoGerlich Jan 23 '19 at 12:24
  • 2
    I agree that your interpretation of the second statement is consistent with the first statement. However, I also agree that the 2 people claiming they mean different things is also consistent. That's the problem. Since you are the person attempting to communicate their thoughts; then the onus is on you to express your thoughts clearly enough to convey your meaning; not the listener. Thus, regardless of whether you think 'you are correct'; the bottom line is 'you are wrong' because your choice of words is not conveying the meaning you intended clearly enough to not be misunderstood. – Dunk Jan 23 '19 at 20:32
  • 2
    I’m going pretty far outside the realms of how a reasonable person might interpret those phrases, but to me: a person who, upon learning that a loved one has been in a car accident, suffers a heart attack and dies might be said to have died “as a result” of said accident but without their death “being caused by” said accident. – eggyal Jan 24 '19 at 13:32
  • 5
    (S2): "One in 80 people dies as a result of a car accident." seems ambiguous to me. Is that one in 80 people *who are in car accidents*, or one out of every 80 people? – Bill the Lizard Jan 24 '19 at 14:58
  • 2
    I think it might be the same if the assumption, that every person dies holds. Not sure about that – TinglTanglBob Jan 24 '19 at 17:30
  • S2 makes me ask, "What do the other 79 people do as the result of a car accident?". My wife rarely supports this sort of response. – GargantuChet Jan 24 '19 at 20:34
  • @eggyal I'm hoping that isn't too far outside of a reasonable interpretation: That is actually the ambiguity that jumped out at me in the interpretations in question and some of the comments, not the difference between 1 in 80 deaths vs 1 in 80 people. "dies in a car accident." (not days later) is not the same as "dies as a result of car accident" (could be like your example) and is not the same as "dies as a result of injuries sustained in a car accident" (could be days later, but was actually in the accident.) – Mr.Mindor Jan 25 '19 at 19:00
  • 1
    **tangential:** https://ourworldindata.org/grapher/share-of-deaths-by-cause-2016 - road incidents: 2.45% ... that's about double 1/80 = 1.25% – Patrick Artner Jan 26 '19 at 12:48
  • I'd certainly parse them differently: the first, to me, means that *at present* (or rather, in some small recent time interval, like a year), 1/80th of the deaths that occurred were due to car accidents. The second, to me, means that 1/80th of the people alive today will die in a car accident. That does not follow from the former: for example, a major war starting in ten years time, or improvements in road safety could reduce that figure, while improvements in medical care or a plague of major traffic disasters could increase it. – user3482749 Jan 26 '19 at 18:10

9 Answers

81

To me "1 in 80 deaths..." is by far the clearer statement. The denominator in your "1 in 80" is the set of all death events and that statement makes it explicit.

There's ambiguity in the "1 in 80 people..." formulation. You really mean "1 in 80 people who die..." but the statement can just as easily be interpreted as "1 in 80 people now alive..." or similar.

I'm all for being explicit about the reference set in probability or frequency assertions like this. If you're talking about the proportion of deaths, then say "deaths" not "people".

Brent Hutto
  • 31
    ""1 in 80 people who dies..." - given that there's no such thing as immortality, we can safely assume that the set of people who will die is the same set as all people. You'd need an additional qualifier e.g. "people who will die _next year_" – MSalters Jan 22 '19 at 18:03
  • 17
    @Msalters The statements are in the present tense, so they assert a claim about the present rate. – Acccumulation Jan 22 '19 at 19:41
  • 6
    Technically, I think "1 in 80 people" is more specific, because there are many more deaths than people deaths: deaths of birds, deaths of bacteria, deaths of pedantry, ... – Kimball Jan 22 '19 at 21:10
  • 1
    @Acccumulation: Present tense is also used for broader statements that are universally correct and do not exclusively focus on the present. If I tell you the sun is a star, just because I used present tense doesn't mean that I was implying it's _only currently_ a star. Similarly, while the 1/80 ratio does focus on the present time in particular, the fact that all people die is not a temporary event. It is a matter of fact that is universally correct and no timeframe is explicitly quantified for it. – Flater Jan 23 '19 at 06:54
  • 1
    @Flater - You have a point, but that 1/80 ratio is not universally correct. In particular, we may expect (or hope) that it will have changed by the time of death of most of the people that are alive nowadays. – Pere Jan 23 '19 at 10:34
  • 2
    @Pere: I don't quite follow; you're saying the same thing as when I said `the 1/80 ratio does focus on the present time in particular`. Or am I misunderstanding? – Flater Jan 23 '19 at 10:43
  • @Flater - Or maybe I misunderstood you. – Pere Jan 23 '19 at 10:51
  • @Flater "If I tell you the sun is a star, just because I used present tense doesn't mean that I was implying it's only currently a star." The question isn't whether "1 in 80 people..." implies that the rate was different at other times, the question is whether it says that it was the same. Denying that X implies Y is not the same as saying that X implies not Y. – Acccumulation Jan 23 '19 at 16:00
  • Furthermore, context matters. We know that objects don't generally go from being stars to not being stars. But we know that death rates do often change. And the very fact that it is specified "of people who die" supports the inference that is about present rates. Since everyone dies, this is clearly not restricting the set to people who die in general, it's restricting the set to people who die in present tense. – Acccumulation Jan 23 '19 at 16:00
  • 1
    @Acccumulation: It's a nonsensical distinction. Even if it's ambiguously phrased, which I don't quite agree with, you're effectively arguing that somehow the more likely option is that we know how the people that are currently alive are going to die at some point in the future, as opposed to interpreting it as a present day statistic. – Flater Jan 23 '19 at 16:29
  • 3
    (On the note of possible ambiguity) From an English language perspective; I think a more common misunderstanding (and more serious) would be to read S2 to mean "1 in 80 people who have a car accident, will die in that accident" (that is, "1 in 80 car accidents are fatal") - which is a very different claim. – Bilkokuya Jan 23 '19 at 16:58
67

First of all, my first intuitive thought was: "S2 can only be the same as S1 if the traffic death rate stays constant, possibly over decades" - which certainly wouldn't have been a good assumption in the last so many decades. This already hints that one difficulty lies with implicit/unspoken temporal assumptions.

I'd say your statements have the form

1 in $x$ $population$ experience $event$.

In S1, the population are deaths, and the implied temporal specification is at present or "in a suitably large [to have sufficient case numbers] but not too wide time frame [to have approximately constant car accident characteristics] around the present".

In S2, the population are people. And others seem to read this not as "dying people" but as "living people" (which after all, is what people more frequently/longer do). If you read the population as living people, clearly, not one of every 80 people living now dies "now" of a car accident. So that is read as "when they are dying [possibly decades from now], the cause of death is car accident".

Take home message: always be careful to spell out who your population are and the denominator of fractions in general. (Gerd Gigerenzer has papers about not spelling out the denominator being a major cause of confusion, particularly in statistics and risk communication).

cbeleites unhappy with SX
  • 1
    "In S1, the population are deaths, and the implied temporal specification is at present or "in a suitably large [to have sufficent case numbers] but not too wide time frame [to have approximately constant car accident characteristics] around the present"- When considering all of the (wonderful) answers I've received, I think this cuts to the heart of the matter the most. I've been doubly surprised at the huge variety of ways people can the second statement, and many have opened my eyes to those interpretations, but I this is the statistical specification my original statement lacked. – faulty_ram_sticks Jan 25 '19 at 17:28
  • faulty_ram_sticks: Thanks for the flowers :-) And these two possible population specifications I made from S2 are by far not the only ones, e.g. you could specify along the lines of @PeterShor's answer a population of people born at some time frame and follow that one (longitudinally). And so on. – cbeleites unhappy with SX Jan 25 '19 at 17:34
  • @cbeleites flowers? what flowers? :-/ – user64742 Jan 27 '19 at 22:40
  • 1
    Might I also point out that the usage of "deaths *is* caused" and "people *dies*" is confusing/bad wording as well. I feel like when describing a statistic it should be "deaths were caused" and "people have died". That makes it unambiguous that you are describing a past event rather than making a prediction. Of course one could also say that "people will die" to say that the statistic provides credence towards a prediction that a certain percentage of the current population will die from car accidents. I give these examples here for future reference as they are not an answer in their own right. – user64742 Jan 27 '19 at 22:49
  • But to say a percentage of deaths *is* caused confuses me because that describes the present as if the event is occurring right now or something. If I saw that, I would consider it poor wording, but probably not read much into it. However, since this is a discussion of wording, I felt I should bring it up. – user64742 Jan 27 '19 at 22:51
43

It depends on whether you are describing or predicting.

"1 in 80 people will die in a car accident" is a prediction. Of all the people alive today, some time within their remaining lifetime, one in 80 will die that way.

"1 in 80 deaths are caused by a car accident" is a description. Of all the people who died in a given period (e.g. the time span of a supporting study), 1 in 80 of them did indeed die in a car accident.

Note that the time window here is ambiguous. One sentence implies that the deaths have already occurred; the other implies they will occur some day. One sentence implies that your baseline population is people who have died (and who were alive before that); the other implies a baseline population of people who are alive today (and will die eventually).

These are actually different statements entirely, and only one of them is probably supported by your source data.

On a side note, the ambiguity arises from a mismatch between the state of being a person (which happens continuously) and the event of dying (which happens at a point in time). Whenever you combine things in this way you get something that is similarly ambiguous. You can instantly resolve the ambiguity by using two events instead of one state and one event; for example, "Of each 80 people who are born, 1 dies in a car accident."

StatsStudent
John Wu
  • 1
    Aside from being a prediction, it could bizarrely be interpreted as a threat. Reading this answer made me think of [the relevant xkcd](https://xkcd.com/190/). – Wildcard Jan 24 '19 at 22:38
  • 4
    "Today, one in 80 deaths is caused by a car accident. But given the rapid improvement in vehicle and road technology, and the gradual shift to other modes of transport, we expect this to drop to one in 120 by the year 2050, and one in 150 by 2100. Accordingly, of people alive today, only one in 135 will die in a car accident". – Michael Kay Jan 25 '19 at 18:40
21

The two statements are different because of sampling bias, because car accidents are more likely to occur when people are young.

Let's make this more concrete by positing an unrealistic scenario.

Consider the two statements:

  • One half of all deaths are caused by a car accident.
  • One half of all people alive today will die in a car accident.

We will show that these two statements are not the same.

Let's simplify things greatly and suppose that everybody born will either die of a heart attack at age 80 or a car accident at age 40. Further, let's suppose that the first statement above holds, and that we're in a steady state population, so deaths balance births. Then there will be three populations of humans, all equally large.

  • People under 40 who will die of a car accident.
  • People under 40 who will die of a heart attack.
  • People over 40 who will die of a heart attack.

These three populations have to be equally large, because the rate of people dying in car accidents (from the first population above) and the rate of people dying in heart attacks (from the third population above) are equal.

Why are they equal? The number of people who die in car accidents each year is $1/40$ of the number of people in the first population, and the number of people who die by heart attacks is $1/40$ of the number of people in the third population, so the two populations have to have equal size. Further, the second population is the same size as the third (because the third population is the second, 40 years later).

So in this case, only one third of all people alive today will die in a car accident, so the two statements are not the same.
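The arithmetic of this toy scenario can be checked with a few lines of Python. This is only a sketch of the example above: the function name `steady_state_shares` and the figure of 2 births per year are illustrative assumptions, and the steady-state rule "people alive = births per year × years lived" is the only modelling step.

```python
from fractions import Fraction


def steady_state_shares(births_per_year=2, car_age=40, heart_age=80):
    """Toy steady-state model: half of each birth cohort dies in a car
    accident at `car_age`, the other half of a heart attack at `heart_age`.

    Returns (car share of yearly deaths, car-destined share of people alive).
    """
    # Hypothetical cohort split: half of the births go to each cause of death.
    car_births = Fraction(births_per_year, 2)
    heart_births = Fraction(births_per_year, 2)

    # In steady state, the number alive in a group is
    # (births per year) * (years each member lives).
    alive_car = car_births * car_age
    alive_heart = heart_births * heart_age

    # In steady state, yearly deaths in a group equal its yearly births.
    share_of_deaths = car_births / (car_births + heart_births)
    share_of_living = alive_car / (alive_car + alive_heart)
    return share_of_deaths, share_of_living


deaths_share, living_share = steady_state_shares()
print(deaths_share, living_share)  # 1/2 1/3
```

Half of all deaths are car accidents, yet only a third of the people alive at any moment are headed for one, because those who die at 40 spend half as long in the living population.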

In real life, my impression is that car accidents occur at a significantly younger age than most other causes of death. If this is the case, there will be a substantial difference between the numbers in your statement one and two.

If you modified the second statement to

  • One half of all people born will die in a car accident,

then under the assumption of a steady state population, the two statements would be equivalent. But of course, in the real world we don't have a steady state population, and a similar (although more complicated) argument shows that for a growing, or shrinking, population, sampling bias still makes these two statements different.
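For the growing or shrinking case, here is a minimal sketch under one specific assumption that is not in the answer itself: births change by a constant factor `1 + growth` each year. The function name `car_share_of_deaths` and its parameters are hypothetical. A fraction `p_born` of every birth cohort dies in a car accident at age 40 and the rest die of a heart attack at age 80, as in the toy scenario.

```python
def car_share_of_deaths(p_born=0.5, growth=0.01, car_age=40, heart_age=80):
    """Share of this year's deaths that are car accidents when yearly births
    grow by a constant factor (1 + growth). Assumed toy model, not real data.
    """
    # A cohort born k years ago is (1 + growth) ** -k times the size of
    # this year's birth cohort.
    car_deaths = p_born * (1 + growth) ** (-car_age)
    heart_deaths = (1 - p_born) * (1 + growth) ** (-heart_age)
    return car_deaths / (car_deaths + heart_deaths)


for g in (0.0, 0.01, -0.01):
    share = car_share_of_deaths(growth=g)
    print(f"growth={g:+.2f}: car share of deaths = {share:.3f}")
```

With zero growth the share of deaths equals the share of births (one half). With positive growth the people dying at 40 come from a larger, more recent cohort than those dying at 80, so car accidents make up more than half of deaths; with negative growth they make up less than half.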

Peter Shor
  • "These three populations have to be equally large, because the rate of people dying in heart attacks (from the first population above) and the rate of people dying in heart attacks (from the third population above) are equal." Do you think you could make this a little bit more obvious? (Also, I presume you mean car accidents for the first population) Otherwise, this is a great example. I hadn't even considered that there could be a difference between "One half of all people born" and "One half of all people alive". – faulty_ram_sticks Jan 23 '19 at 14:18
  • @faulty_ram_sticks. Yes: sampling bias can be tricky. I've explained it in more detail ... I hope this is good enough now. And thanks for catching my typo. – Peter Shor Jan 23 '19 at 20:11
  • 5
    This is a wonderful sampling bias question. The fact that the first 8 answers didn't catch the sampling bias shows that this is really tricky. I may use it when I teach probability. – Peter Shor Jan 23 '19 at 20:25
  • Very well constructed, clear argument, and one which theoretically invalidates the OP's intuition of equivalence in a way s/he probably didn't even think of. – Peter - Reinstate Monica Jan 24 '19 at 11:07
12

Is my default interpretation indeed equivalent to Statement One?

No.

Let's say we have 800 people. 400 died: 5 from a car crash, the other 395 forgot to breathe. S1 is now true: 5/400 = 1/80. S2 is false: 5/800 ≠ 1/80.

The problem is that technically S2 is ambiguous because it doesn't specify how many deaths there were in total, while S1 does. Alternately, S1 has one more piece of information (total deaths) and one less piece of information (total people). Taken at face value, they describe different ratios.
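The two ratios can be written out explicitly. This small sketch just reuses the made-up numbers above (800 people, 400 deaths, 5 car deaths); the variable names are illustrative.

```python
from fractions import Fraction

people = 800      # everyone in the hypothetical data set
deaths = 400      # of whom this many have died
car_deaths = 5    # deaths caused by a car accident

# S1 reading: the denominator is deaths.
share_of_deaths = Fraction(car_deaths, deaths)

# S2 read at face value: the denominator is all people.
share_of_people = Fraction(car_deaths, people)

print(share_of_deaths)  # 1/80
print(share_of_people)  # 1/160
```

The numerator is the same in both cases; only the denominator changes, which is exactly the piece S2 leaves unstated.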

Is it unusual or reckless for this to be my default interpretation?

I actually disagree with your interpretation, but I think it doesn't matter. Likely, context would make it obvious what is meant.

  • On the one hand, obviously all people die, thus it is implicit that total people = total deaths. So if you are discussing rates of death in general, your default interpretation applies.
  • On the other, if you are discussing a limited data set in which it is not a given that everybody dies, my interpretation above is more accurate. But it seems not hard for the reader to overlook this.

You might ask where you could possibly encounter people who don't die. For one, we could be working with a statistical dataset that only tracks people for 5 years, so the ones still alive at the end of the study must be ignored, as it's not known what they will die from. Alternatively, the cause of death may be unknown, in which case you can't really assign it to cars or not cars.

If you do think S1 and S2 different, such that to state the second when one means the first is misleading/incorrect, could you please provide a fully-qualified revision of S2 that is equivalent?

"One in 80 people who die, does so as a result of a car accident." which amounts to rephrasing S1.

Cloons
  • This answer mimics the semantical argument made by Brent Hutto who speaks about an ambiguity because it is unclear when you say '1 in x people die as a result of y' whether one means mortality *rate* for the specific cause of death $y$ or the proportion of total deaths that will be due to cause $y$. Technically speaking the sentence should be like '1 in x people die, during period z, as a result of y' in order to be logically interpreted as referring to a mortality rate. So isn't the ambiguity more because people are not always interpreting logically? Yes, it is. – Sextus Empiricus Jan 22 '19 at 21:20
  • But that was just a sidenote. What I wanted to say is that this answer only notes a semantical issue but is missing the statistical issue which is about the situation that the probability for a death to be due to a car accident is not constant independent of time. This makes it different when we refer to 'is caused' (the past) and '(will) die' (the present or future). – Sextus Empiricus Jan 22 '19 at 21:23
  • "Cause of death" is a medicolegal term which takes a latent time to establish medically, to report, and collect. One can only state this in retrospect when the probability of missing a cause of dying is small, or otherwise one corrects for missing data, e.g., by including late reporting from the prior reporting time period. So, a statistically proper statement sounds something like, "In *geographic data collection region* during *time period* the odds of death attributable to car accidents *was* estimated to be one in 80 deaths." – Carl Jan 23 '19 at 00:08
7

I would agree that your interpretation of the second statement is consistent with the first statement. I would also agree that it's a perfectly reasonable interpretation of the second statement. That being said, the second statement is much more ambiguous.

The second statement can also be interpreted as:

  • Given a sample of individuals in a recent car accident, 1/80 died.
  • Given a population sample at large, 1/80 will die because of factors related to a car accident, some of them being the accidents themselves, but some others being suicide, injuries, medical malpractice, vigilante justice, etc.
  • Extrapolating current safety trends indicates that 1/80 people alive today will die because of a car accident.

The second and third interpretations above might be close enough for lay audiences, but the first one is pretty substantially different.

Alex H.
  • 2
    The first interpretation '1/80 of all victims in a car accident die because of the accident' is how I interpreted the second statement at first. Although I do not think that it is a correct interpretation and is more like resulting from skim reading. – Sextus Empiricus Jan 22 '19 at 20:52
5

The basic difference is that the two statements refer to different populations of humans, and different time frames.

"One in 80 deaths is caused by a car accident" presumably refers to the proportion of deaths in some fairly limited time period (say one year). Since the proportion of the total population using cars, and the safety record of the cars, have both changed significantly over time, the statement doesn't make any sense unless you state what time interval it refers to. (As a ridiculous example, it would clearly have been completely wrong for the year 1919, considering the level of car ownership and use in the total population at that time). Note, the "proportion of the total population using cars" in the above is actually a mistake - it should be "the proportion of people who will die in the near future using cars" and that is going to be skewed by the fact that young and old people have different probabilities of dying from non-accident-related causes, and also have different amounts of car use.

"One in 80 people dies as a result of a car accident" presumably refers to all humans who are currently alive in some region, and their eventual cause of death at some unknown future time. Since the prevalence and safety of car travel will almost certainly change within their lifetimes (say within the next 100 years, for today's new-born infants) this is a very different statement from the first one.

alephzero
3

A1) Assuming everyone dies, and assuming the context of a sufficiently small period of time around when the measurements were taken, yes, your interpretation of S2 matches S1.

A2) Yes, your interpretation of S2 is reckless. S2 can be interpreted as "1 in 80 people involved in car accidents die" which is obviously not equivalent to S1. Therefore using S2 could cause confusion.

Your interpretation of 1 in 80 is reasonable, though, and the other interpretation (1 in any 80) is very unusual. "1 in N of U is P" is a very common shorthand for "given a predicate, P, and N random samples, x, from universe U, the expected number of samples such that P(x) is true approximately equals 1".
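The gap between the "expected number is 1" reading and the "exactly one in any 80" reading can be sketched numerically. This assumes, purely for illustration, that the 80 people are independent draws, each satisfying P with probability 1/80 (a binomial model); the helper names `expected_count` and `prob_exactly_k` are hypothetical.

```python
from math import comb


def expected_count(n, p):
    """Expected number of samples satisfying P among n independent draws."""
    return n * p


def prob_exactly_k(n, p, k):
    """Binomial probability that exactly k of n independent draws satisfy P."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n, p = 80, 1 / 80
print(f"expected count:     {expected_count(n, p):.3f}")
print(f"P(exactly one):     {prob_exactly_k(n, p, 1):.3f}")
print(f"P(none at all):     {prob_exactly_k(n, p, 0):.3f}")
```

Under this assumed model the expected count is 1, but the chance that a given group of 80 contains exactly one such person is only about 0.37, and the chance that it contains none is nearly as large. That is why "1 in any 80" is a much stronger claim than the usual shorthand.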

A3) Out of all people, 1 in 80 dies as a result of a car accident.

Vaelus
  • You do not explain your answer to A1. Yet, there have been some answers addressing other issues (saying no to A1). Could you explain why those are not right? – Sextus Empiricus Jan 22 '19 at 21:29
  • @MartijnWeterings The only answer saying no to A1 seems to disagree on the grounds that over some finite period of time, not everyone necessarily dies. Do you think my edit addresses this? – Vaelus Jan 23 '19 at 00:41
  • I think the sentence that doesn't indicate that the other 79 died from other causes is therefore ambiguous. So it is best to state the other. – Michael R. Chernick Jan 23 '19 at 01:03
  • Vaelus, there is another reason to say 'no' to A1. That reason is that the fraction of deaths that are due to car accidents may not need to be constant in time, thus it is ambiguous, and S1 and S2 have a different perspective regarding this (not constant) expression. – Sextus Empiricus Jan 23 '19 at 12:34
-1

Yes, it is wrong, and neither phrasing seems sufficient to consistently convey your desired meaning

Speaking as a layperson, if your target is laypeople, I would definitely recommend posting over at https://english.stackexchange.com/, rather than here - your question took me a few reads to unentangle what S1 & S2 intuitively mean to me vs. what you meant to say.

For the record, my interpretations of each statement:

  • (S1) - per 80 deaths, 1 death by car accident

  • (S2) - per 80 people in a car accident, 1 death

To convey your meaning, I would likely use a modified S2: "One in 80 people will die in a car accident."

This still contains some ambiguity, but keeps a similar brevity.

ap55