I'm working on a multiple logistic regression in R using glm. The predictor variables are continuous and categorical. An extract of the summary of the model shows the following:

Coefficients:
               Estimate Std. Error z value Pr(>|z|)
(Intercept)   2.451e+00  2.439e+00   1.005   0.3150
Age           5.747e-02  3.466e-02   1.658   0.0973 .
BMI          -7.750e-02  7.090e-02  -1.093   0.2743
...
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Confidence intervals:

                  2.5 %       97.5 %
(Intercept)  0.10969506 1.863217e+03
Age          0.99565783 1.142627e+00
BMI          0.80089276 1.064256e+00
...

Odds ratios:

                 Estimate Std. Error   z value Pr(>|z|)
(Intercept)  1.159642e+01  11.464683 2.7310435 1.370327
Age          1.059155e+00   1.035269 5.2491658 1.102195
BMI          9.254228e-01   1.073477 0.3351730 1.315670
...

The first output shows that $Age$ is significant. However, the confidence interval for $Age$ includes the value 1 and the odds ratio for $Age$ is very close to 1. What does the significant p-value from the first output mean? Is $Age$ a predictor of the outcome or not?
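A note on the third table: its numbers match what one gets by exponentiating every column of the first table, not just the estimates (for example, $e^{0.3150} \approx 1.370$, which is why a "p-value" above 1 appears). Only the exponentiated estimates are odds ratios. The sketch below checks this using values copied from the first table; the variable names are illustrative, and how the original table was actually produced is an assumption.

```r
# Values copied from the first table in the question (log-odds scale).
est <- c("(Intercept)" = 2.451, Age = 0.05747, BMI = -0.07750)
se  <- c("(Intercept)" = 2.439, Age = 0.03466, BMI = 0.07090)
p   <- c("(Intercept)" = 0.3150, Age = 0.0973, BMI = 0.2743)

# Exponentiating the estimates gives odds ratios.
odds_ratio <- exp(est)

# Exponentiating standard errors and p-values reproduces the other
# columns of the third table, but these numbers have no statistical meaning.
not_meaningful <- cbind(exp_se = exp(se), exp_p = exp(p))

print(round(odds_ratio, 4))
print(round(not_meaningful, 4))
```

The p-values and standard errors to read are those in the first table; they stay on the log-odds scale.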

Zhubarb
SabreWolfy
  • It is only significant at the 10% level, but the confidence intervals are at the 5% level (95% intervals). – Nick Sabbe May 04 '11 at 15:08
  • So confidence intervals for 10% would not include 1 then? – SabreWolfy May 04 '11 at 18:06
  • The p-value (last column of the first table) is the probability of obtaining a result at least as extreme as the one observed if the null hypothesis were true. A 95% confidence interval is a region constructed so that it contains the true value in 95% of repeated samples. If it does not contain the hypothesized value, then there is at most a 5% chance of getting the obtained result or a more extreme one if the hypothesis is true, which implies a p-value below 5%. So there is a very close relation between p-values and confidence intervals (statistics 101). In short: yes, the 90% CI (10% level) will not include 1. – Nick Sabbe May 05 '11 at 06:51
  • It appears that you are assuming linearity. How is that justified? – Frank Harrell Oct 24 '13 at 14:04
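
The link between the p-value and the confidence interval can be checked directly from the estimate and standard error for Age in the first table. This is a minimal sketch using Wald intervals (estimate ± critical value × standard error, then exponentiated); `wald_ci` is a hypothetical helper, not a function from any package. The intervals printed in the question differ slightly from the Wald ones, possibly because they came from a profile-likelihood method, which is an assumption here.

```r
# Estimate and standard error for Age, copied from the first table.
est <- 0.05747
se  <- 0.03466

# Wald z statistic and two-sided p-value.
z <- est / se
p_value <- 2 * pnorm(-abs(z))

# Wald confidence interval on the odds-ratio scale (hypothetical helper).
wald_ci <- function(level) {
  crit <- qnorm(1 - (1 - level) / 2)
  exp(est + c(lower = -1, upper = 1) * crit * se)
}

ci95 <- wald_ci(0.95)  # 5% level
ci90 <- wald_ci(0.90)  # 10% level

print(round(c(z = z, p_value = p_value), 4))
print(round(rbind(ci95, ci90), 4))
```

With a p-value between 0.05 and 0.10, the 95% interval contains 1 while the 90% interval just excludes it.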

1 Answer

There are a host of questions on this site that will help with interpreting the model's output (here are three different examples, 1, 2, 3, and I am sure there are more if you dig through the archive). There is also a tutorial on the UCLA stats website on how to interpret the coefficients for logistic regression.

Although the odds ratio for the age coefficient is close to one, that does not necessarily mean the effect is small (whether an effect is small or large is frequently as much a normative question as an empirical one). The odds ratio is per unit of age, so one would need to know the typical variation in age between observations to form a more informed opinion.
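
To make the scale point concrete, here is a rough sketch of how a per-unit odds ratio compounds over larger age differences. It assumes age is measured in years and that the effect is linear on the log-odds scale, as the fitted model does; the age gaps are hypothetical values chosen for illustration.

```r
# Rounded odds ratio for Age from the question (per one unit of age).
or_per_year <- 1.059

# Hypothetical age gaps, assuming age is measured in years.
age_gap <- c(1, 5, 10, 30)

# Under a linear log-odds model, the odds ratio for a gap of d units
# is the per-unit odds ratio raised to the power d.
or_gap <- or_per_year^age_gap
pct_change <- (or_gap - 1) * 100

print(data.frame(age_gap,
                 odds_ratio = round(or_gap, 3),
                 pct_change = round(pct_change, 1)))
```

Whether any of these gaps is typical depends on the spread of ages in the data, which the question does not report.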

Andy W
  • Thanks for the link to the tutorial, which looks comprehensive. I did search here before posting my question. Links 1 and 3 appear not to be related to my question. – SabreWolfy May 04 '11 at 18:04
  • @SabreWolfy, link 1 further elucidates how to interpret the coefficients in terms of the original units, and link 3 describes the steps to interpret the effects in terms of probabilities (which is really applicable to your question, and the suggested plots in that question would be a reasonable response to my point that the size of the direct effect is difficult to interpret without knowing the variation in age). – Andy W May 04 '11 at 18:16
  • Assuming age is measured in years, an odds ratio of 1.059 implies a difference in odds between a 20-year-old and a 50-year-old of $(1.059^{30} - 1)\times 100\% \approx 458\%$. I would not call that a small effect. However, I implicitly assumed you were talking about humans. If instead these are mice, then a 30-year span is not very helpful and you will need to change the evaluation of the size of the effect accordingly. – Maarten Buis Oct 24 '13 at 13:41
  • The UCLA link is dead, but [this one](https://stats.idre.ucla.edu/other/mult-pkg/faq/general/faq-how-do-i-interpret-odds-ratios-in-logistic-regression/) probably corresponds (at least its content helped me understand this question). – MBR Dec 04 '17 at 16:32