70% in the media

https://www.dailymail.co.uk/news/article-9070683

Prime Minister Boris Johnson told a Downing Street briefing that early analysis showed the new strain could increase the reproductive rate by 0.4 or more and that it may be up to 70% more transmissible than the old variant.

Recently the news has been full of articles about a new strain of the coronavirus (SARS-CoV-2) that is said to be more contagious.

It is said that it is "70% more transmissible".

Where does this statistic come from? How has it been derived/estimated (from what raw data) and what does it mean statistically (accuracy, certainty)?


Background

The figure of 70% originates from a presentation by Dr. Erik Volz from Imperial College London. See the COG-UK showcase event on YouTube:

2:36:41 – 2:46:50 Epidemiology of SARS-CoV-2 Genetic Variants. Dr Erik Volz, Imperial College London

Modeling selective advantage

  • What does more 'infectious' really mean? This model relates to a parameter called the selective advantage. In the SIR model paradigm this selective advantage is a ratio of reproduction rates, as explained in Gordo et al., "Genetic Diversity in the SIR Model of Pathogen Evolution", PLoS ONE 4(3): e4876. (Of course, a selective advantage can also arise in other ways, for instance when the reproductive rate is the same or even lower but the incubation time is shorter.)

  • On what measurements is it based? The raw data are counts of occurrences of the strain. From these, estimates are made of the relative presence of the strain, or alternatively of the odds ratio (OR) for a randomly selected infection being of the specific strain.

  • How is it estimated? Based on a time series of this odds ratio, an estimate is made of the relative growth rate and the selective advantage. The 70% means that every 7 days (probably the serial or generation interval used in the computations) the relative presence of the new strain grows by a factor $\exp(0.7) \approx 2$.

    In the presentation the formula used is $$\frac{d}{dt} \log(OR) = \frac{s}{\text{generation time}}$$

    With $\frac{d}{dt} \log(OR) \approx 0.1 \text{ days}^{-1}$ (in the graphs the log odds appear to change by 4 units in a bit more than 1 month) and a generation time of probably about $7 \text{ days}$, you get $s \approx 0.1 \times 7 = 0.7$.

    But that is an indirect representation of infectiousness/transmissibility. It is the relative difference in growth rates, not the relative infectiousness (e.g. the ratio of reproduction rates).

    • In the discrete case: Say you start at $t=1$ with $n_{mutated}(1)=1$ out of $n_{total}(1)$, and $k$ reproduction cycles later you see what you have. Then it is like sampling new cases $n_{mutated}(t)$ and $n_{other}(t)$ $k$ times, and the ratios of these should relate to the relative growth rates. The model and mathematical expression used for the selective advantage should assume that the growth rates are $\propto \ln R$. The relative growth rate is then $s=\ln(R_1/R_2)$. (This actually makes the picture of the situation worse. To get a relative growth of $s = 0.7$ you need $R_1/R_2 = e^{0.7} \approx 2$, which is a transmission/reproduction rate two times as high.)

    • In the continuous case: The article by Gordo et al. models a discrete-time SIR model, and reality is not a discrete-time SIR model. For a continuous model the relationship between the relative growth rate of strains and the ratio of reproduction rates is different: instead of $\propto \ln(R)$, the growth rates will be roughly $\propto R-1$. So the relative difference in growth rates will be $s = (R_{mutant}-1)-(R_{other}-1) = R_{mutant}-R_{other}$.

    In addition to the meaning of the statistic being unclear, there are also issues with randomness and bias due to the way of sampling. Small fluctuations, locally and in time, can make a particular strain grow fast without this being related to the selective advantage. The measurements of the presence of the strain need not come from a single homogeneous population; they are sampled from a wider group (e.g. the strain is measured in a hospital or a village, but the growth may be large in a particular part of the hospital or a particular neighborhood, and this might correlate with the strain). The growth is turbulent and noisy, and when the time scale of the analysis is short there may be a large chance of picking up a temporary fluctuation.

    So the 70% figure that was presented by Boris Johnson, which is the relative growth rate of the new virus strain, is not the relative infectiousness.
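The arithmetic in the bullets above can be sketched in a few lines of Python. This is only an illustration of the formulas quoted in this post, not the method used in the presentation. The weekly counts, the helper names, and the baseline values passed as `r_other` are hypothetical, and the 7-day generation time is the guess made above.

```python
import math

def log_odds_slope(days, n_variant, n_other):
    """Least-squares slope of log(n_variant / n_other) against time in days."""
    log_odds = [math.log(v / o) for v, o in zip(n_variant, n_other)]
    mean_t = sum(days) / len(days)
    mean_y = sum(log_odds) / len(log_odds)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(days, log_odds))
    sxx = sum((t - mean_t) ** 2 for t in days)
    return sxy / sxx

def selective_advantage(slope_per_day, generation_time_days):
    """s = (d/dt log OR) * generation time, the formula quoted above."""
    return slope_per_day * generation_time_days

def ratio_discrete(s):
    """Discrete-time reading: s = ln(R_1 / R_2), so R_1 / R_2 = exp(s)."""
    return math.exp(s)

def ratio_continuous(s, r_other):
    """Continuous-time reading: s = R_mutant - R_other.
    The ratio then depends on the (assumed) baseline r_other."""
    return (r_other + s) / r_other

# Hypothetical counts, made up for illustration: the variant doubles every week.
days = [0, 7, 14, 21, 28]
n_variant = [5, 10, 20, 40, 80]
n_other = [1000, 1000, 1000, 1000, 1000]

slope_fit = log_odds_slope(days, n_variant, n_other)  # ln(2)/7 per day
s_fit = selective_advantage(slope_fit, 7.0)           # ln(2)

# Values read off the presentation as described above (approximate).
s_talk = selective_advantage(0.1, 7.0)

print("fitted slope per day:", round(slope_fit, 4))
print("s from fitted slope:", round(s_fit, 3))
print("s from 0.1/day and 7 days:", round(s_talk, 3))
print("discrete reading, R_1/R_2:", round(ratio_discrete(s_talk), 2))
for r_other in (0.8, 1.0, 1.2):  # illustrative baselines, not estimates
    print("continuous reading, R_other =", r_other,
          "ratio =", round(ratio_continuous(s_talk, r_other), 2))
```

The point of the last loop is that the same $s = 0.7$ maps to different ratios of reproduction rates depending on which model and which baseline is assumed, so "70%" on its own does not pin down the relative infectiousness.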

The 70% figure is more like a side note in the presentation. It was used to show that this strain grows relatively fast in the initial growth phase (and for other strains it is shown that this growth eventually decreases, so the initial figure does not reflect the true relative infectiousness). It was not meant to be an estimate of the relative infectiousness.
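The sampling-noise concern raised above (short-term fluctuations that look like a selective advantage) can be illustrated with a toy simulation. This is a minimal sketch under strong assumptions: a fixed number of cases per generation, pure resampling with no selective advantage at all ($s = 0$), and made-up parameter values (200 cases, 5% starting fraction, 5 generations). It is not the model from the presentation.

```python
import math
import random

def neutral_log_odds_change(n_cases, start_fraction, generations, rng):
    """Resample n_cases per generation with NO selective advantage (s = 0)
    and return the apparent change in log odds per generation."""
    fraction = start_fraction
    variant = round(start_fraction * n_cases)
    # 0.5 is a continuity correction so that log odds stay finite at 0 counts.
    first = math.log((variant + 0.5) / (n_cases - variant + 0.5))
    for _ in range(generations):
        # Each new case is the variant with probability equal to its current share.
        variant = sum(rng.random() < fraction for _ in range(n_cases))
        fraction = variant / n_cases
    last = math.log((variant + 0.5) / (n_cases - variant + 0.5))
    return (last - first) / generations

rng = random.Random(0)  # fixed seed so the run is reproducible
estimates = [neutral_log_odds_change(200, 0.05, 5, rng) for _ in range(1000)]

mean = sum(estimates) / len(estimates)
sd = math.sqrt(sum((e - mean) ** 2 for e in estimates) / (len(estimates) - 1))
print("apparent s per generation, min and max:", min(estimates), max(estimates))
print("standard deviation across replicates:", sd)
```

Every replicate has a true selective advantage of zero, so the spread of the printed values gives a feeling for how large an apparent $s$ can arise from chance alone when counts are small and the time window is short.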

Sextus Empiricus
  • 2
    It seems to be related to medicine & epidemiology rather than statistics. – Tim Dec 20 '20 at 13:06
  • @Tim this 70% is a **statistic** and the question is what that **statistic** means. It relates to the tag 'statistics-in-media'. Or is that tag off-topic? I believe that this question has much more to do with statistics, and how politicians and media use it, than with medicine and epidemiology. Sure, eventually there is an epidemiological background, but the fact is that this **number/statistic** is in the media and not the underlying epidemiological background. – Sextus Empiricus Dec 20 '20 at 13:54
  • In addition the 'where does the information come from?' is very much statistics related. You can have a sense about the underlying epidemiological model, and that is important to be able to think about these sort of figures, but beyond that the more interesting question is actually 'what is the basis for claiming a certain statistical figure?'. The '70 %' is being presented without any error bounds or information about the underlying data, and statistically speaking is not very carefully presented. – Sextus Empiricus Dec 20 '20 at 14:03
  • This seems like last Nordic winter all over again when it was claimed in the media that the virus was gonna kill 2% of the people (based on bad **statistics**). It is the job of the statistics community to provide the right nuance, the background, and question these figures when they are not presented well enough. – Sextus Empiricus Dec 20 '20 at 14:17
  • 3
    This might be a "statistic" only in a very general sense: I would understand this figure to be an estimate of a *model parameter* for some kind of transmission model. Thus, your question comes down to "which model is being referred to?" That's a matter of researching the source of the quotation -- as @Tim suggests, this is epidemiology more than statistics. – whuber Dec 20 '20 at 15:42
  • 1
    @whuber in my view it is like questions such as these: [How exactly is the "effectiveness" in the Moderna and Pfizer vaccine trials estimated?](https://stats.stackexchange.com/questions/496730) or [Origin of "5σ" threshold for accepting evidence in particle physics?](https://stats.stackexchange.com/questions/31591). We have a statistic being used in the media, but it is unclear what it means. Not only from the domain point of view (epidemiology) but also statistically. – Sextus Empiricus Dec 20 '20 at 16:48
  • I have edited the question to focus on the statistical part. – Sextus Empiricus Dec 21 '20 at 06:27
  • 1
    Another question in the same style is: [What does 94.5% effective mean?](https://stats.stackexchange.com/questions/496635) – Sextus Empiricus Dec 21 '20 at 06:30
  • 3
    Good question, very relevant and shouldn't be labeled 'off-topic' – Arne Jonas Warnke Dec 21 '20 at 06:34
  • 1
    In this news item we see the number being criticized on *statistical* grounds https://www.dailymail.co.uk/news/article-9073765/Scientists-call-clarity-claim-new-Covid-19-variant-strain-70-contagious.html Several phrases relate to statistics, like *'every expert is saying it's too early to draw such an inference'* and *'the basis of the figures '* – Sextus Empiricus Dec 21 '20 at 09:01
