How are statistics on scientific papers inferred?

Question

Referring to this question: "How should be statistics on scientific papers read?", a kind user explained to us how to read the following statement from the original research paper:

Increasing carbohydrate intake was associated with increasing stroke risk (HR = 2.01, 95%CI = 1.04–3.86 highest vs. lowest quintile; p for trend 0.025).

Multivariable Cox modeling estimated adjusted hazard ratios (HRs) of stroke with 95% confidence intervals (95%CI).

In the answer I was told to interpret this as:

The confidence interval suggests that we can conclude, with 95% certainty, that the true hazard rate in the population could fall anywhere between 1.04 and 3.86. In the broader population, the stroke risk associated with increased carbohydrate consumption could be as high as 3.86 times or as low as 1.04 times that of the comparison group.

I don't understand what part of the original text gave this information on the confidence interval.
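
For readers wondering how the three reported numbers relate to each other, here is a minimal sketch. It assumes the paper used a Wald-type interval, which is the usual default for Cox models but is not stated in the quoted text: the interval is built as log(HR) plus or minus about 1.96 standard errors and then exponentiated. Under that assumption the standard error can be backed out of the reported bounds and the interval rebuilt from it.

```python
import math

# Values reported in the quoted sentence.
hr, lower, upper = 2.01, 1.04, 3.86

# Assumption (not stated in the paper excerpt): a Wald-type interval,
# i.e. symmetric around log(HR) on the log scale.
z = 1.959964  # 97.5th percentile of the standard normal distribution

log_hr = math.log(hr)
# The width of the interval on the log scale is 2 * z * SE.
se = (math.log(upper) - math.log(lower)) / (2 * z)

# Rebuild the interval from the point estimate and the implied SE.
rebuilt_lower = math.exp(log_hr - z * se)
rebuilt_upper = math.exp(log_hr + z * se)

print(f"log(HR) = {log_hr:.3f}, implied SE = {se:.3f}")
print(f"rebuilt 95% CI = {rebuilt_lower:.2f} to {rebuilt_upper:.2f}")
```

The rebuilt bounds should land close to the published ones; small differences are expected because the paper reports rounded values.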

By data we usually mean "whatever we have observed". Say we have a survey: then a table with, in the first row and first column, the answer of the first respondent to the first question, and, in the first row and second column, the answer of the first respondent to the second question, is the data. I don't think that is what you meant when you used the word data. However, I don't understand what you do want to know. — Maarten Buis, Sep 25 '14 at 09:46
Based on the title this question sounds very broad (unanswerable as one cannot explain all possible ways statistics can be used in scientific papers in a single answer). If you mean to ask, for example, 'how this confidence interval was obtained', the question should be edited to reflect that. — Juho Kokkala, Sep 25 '14 at 09:50
@JuhoKokkala: yes, I refer only to the specific example I provided. — Revious, Sep 25 '14 at 11:40
@MaartenBuis: for me it's hard to understand how, looking at some cases, I can say: "The confidence interval suggests that we can conclude, with 95% certainty, that the true hazard rate in the population could fall anywhere between 1.04 and 3.86." Can you help us with an explanation of this sentence and how it was inferred in the specific example (if that's possible, obviously)? — Revious, Sep 25 '14 at 11:43
That sentence is incorrect, though it is a common error. A more correct interpretation of the confidence interval is that if we were to repeatedly sample from the population and compute a 95% confidence interval for the hazard ratio in each of these samples, then we would expect 95% of those intervals to contain the true hazard ratio. You can read more about confidence intervals here: http://en.wikipedia.org/wiki/Confidence_interval . — Maarten Buis, Sep 25 '14 at 11:51
This interpretation was inferred from the text using the part "95%CI = 1.04–3.86". This reads "the 95% confidence interval is [1.04, 3.86]", which is interpreted as in my previous comment. — Maarten Buis, Sep 25 '14 at 11:54
@MaartenBuis: ok, but it's not completely clear to me why this data is important. Is it saying something like: "different groups of subjects have a different correlation between stroke and eating sugar, but 95% of the subgroups still show a correlation, which means that eating sugar is bad for stroke risk in any case"? Does it mean something similar? — Revious, Sep 25 '14 at 12:07
I have just substantially edited your question. I think that this is what you wanted to ask us, but finding the right terminology got in the way. Please correct me if this is not the question you wanted to ask. — Maarten Buis, Sep 25 '14 at 12:07
It is not data. Using the word data like this will cause unnecessary confusion on sites like this. — Maarten Buis, Sep 25 '14 at 12:10
The data used to estimate a model comes from a random sample. As a consequence your point estimates (in this case, hazard ratios) are uncertain: If you were to repeat the entire research project, you would get a different random sample and thus different point estimates. The confidence interval is a way of quantifying that uncertainty. — Maarten Buis, Sep 25 '14 at 12:13
@MaartenBuis: ok, I think I've understood. It's a concept which is similar to variance but more sophisticated: if the variance is 0, the confidence is 100%. The higher the variance, the easier it is to get different results just because we selected a different group. Also, if the 95% interval is very broad, it may imply that there could be a clustering based on other factors (e.g. having diabetes) which strongly differentiates two subsets on the risk of stroke. Is that correct? — Revious, Sep 25 '14 at 13:14
I am not sure I understand your analogy. I am pretty sure you are mixing up some terminology. I suggest you find an introductory statistics book that discusses inference (statistical testing and confidence intervals). — Maarten Buis, Sep 25 '14 at 14:12
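
The repeated-sampling idea discussed in the comments above can be made concrete with a small simulation. This is only a sketch under simplifying assumptions that are not in the paper: survival times are exponential, there is no censoring and no covariate adjustment, and the "true" hazard ratio is set to 2.0 by hand. The function name `simulate_coverage` and all parameter values are hypothetical. Each simulated "study" draws a fresh random sample, estimates the hazard ratio, and builds a Wald 95% interval; the function then reports how often the interval contains the true value.

```python
import math
import random


def simulate_coverage(true_hr=2.0, n_per_group=100, n_studies=2000, seed=1):
    """Fraction of simulated studies whose 95% CI contains the true hazard ratio."""
    rng = random.Random(seed)
    z = 1.959964       # 97.5th percentile of the standard normal
    base_rate = 0.1    # hazard in the comparison group (arbitrary choice)
    covered = 0

    for _ in range(n_studies):
        # Total follow-up time in each group; every subject has an event.
        time_ref = sum(rng.expovariate(base_rate) for _ in range(n_per_group))
        time_exp = sum(rng.expovariate(base_rate * true_hr) for _ in range(n_per_group))

        # Maximum-likelihood hazard estimate: events divided by total time.
        rate_ref = n_per_group / time_ref
        rate_exp = n_per_group / time_exp
        log_hr = math.log(rate_exp / rate_ref)

        # Large-sample standard error of the log hazard ratio.
        se = math.sqrt(1 / n_per_group + 1 / n_per_group)

        lower = math.exp(log_hr - z * se)
        upper = math.exp(log_hr + z * se)
        if lower <= true_hr <= upper:
            covered += 1

    return covered / n_studies


coverage = simulate_coverage()
print(f"Share of intervals containing the true hazard ratio: {coverage:.3f}")
```

Every simulated study gives a different point estimate and a different interval, which is the uncertainty the confidence interval is meant to quantify. The 95% refers to the long-run share of intervals that capture the true value, not to the probability that one particular published interval does.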
