I am using Firth logistic regression to analyze data with a rare event. In my model I have 4 continuous variables and 1 dichotomous variable. This is my code:

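    library(logistf)

    # Firth (penalized likelihood) logistic regression
    full1F <- logistf(Stroke ~ log(v1) + sqrt(v2) + log(v3) + log(v4) + dich_var)
    summary(full1F)
    # Odds ratios with their confidence limits
    exp(cbind(OR = coef(full1F), confint(full1F)))
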
  1. What statistic should I report to describe model fit (i.e. akin to AIC for GLM models)?

  2. How should I interpret the p-values (e.g. sqrt(v2) is significant based on the CI, but its p-value is 1.0)?

  3. Why is the confidence interval for the dichotomous variable so wide? This is the output:

                   OR L95%      U95%
    (Intercept) 28.67 5.06    162.5
    log(v1)      0.88 0.80      0.97
    sqrt(v2)     1.51 1.37      1.69
    log(v3)      1.36 1.13      1.64
    log(v4)      0.62 0.50      0.76
    dich_var    62.76 0.09 167702.4
    

Full output from logistf and extractAIC:

    Model fitted by Penalized ML
    Confidence intervals and p-values by Profile Likelihood

                 coef       se(coef)   lower 0.95 upper 0.95  Chisq      p
    (Intercept)  4.12807056 2.7584264  2.6061200  5.671118391 1.479713 0.2238194
    log(v1)     -0.08109114 0.1625829 -0.1744268  0.006705001 0.000000 1.0000000
    sqrt(v2)     0.39223967 0.1831892  0.2963022  0.501110156 0.000000 1.0000000
    log(v3)      0.31123164 0.3336092  0.1309848  0.502245304 0.000000 1.0000000
    log(v4)     -0.53354748 0.3718985 -0.7448874 -0.331731496 0.000000 1.0000000
    dich_varYes  4.14502663 3.9598211 -3.1206739 12.035651631    Inf   0.0000000

    Likelihood ratio test=5.612847 on 5 df, p=0.3457304, n=1714
    Wald test = 10.01195 on 5 df, p = 0.07489748

    Covariance-Matrix:
                [,1]         [,2]          [,3]         [,4]          [,5]
    [1,]  7.60891623  0.355883900 -0.2358586207  0.107347858 -0.9014000169
    [2,]  0.35588390  0.026433207 -0.0139567792  0.006657177 -0.0352398354
    [3,] -0.23585862 -0.013956779  0.0335582829 -0.012769583  0.0002785645
    [4,]  0.10734786  0.006657177 -0.0127695828  0.111295107 -0.0034625698
    [5,] -0.90140002 -0.035239835  0.0002785645 -0.003462570  0.1383084900
    [6,]  0.07854902  0.004844901 -0.0121205581 -0.007438953 -0.0037122993
                 [,6]
    [1,]  0.078549017
    [2,]  0.004844901
    [3,] -0.012120558
    [4,] -0.007438953
    [5,] -0.003712299
    [6,] 15.680183083

    extractAIC(full1F)
    [1] 5.000000 4.387153
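
On the two numbers from `extractAIC`: by R's general convention, `extractAIC` returns the equivalent degrees of freedom first and the criterion value second. The values printed above are consistent with the second number being 2 * df minus the likelihood ratio statistic, that is, an AIC expressed relative to the null model. This reading is inferred from the printed output only and is not confirmed against the `logistf` documentation; the variable names in the sketch are illustrative.

```r
# Values copied from the summary() output above
lr_chisq <- 5.612847   # "Likelihood ratio test=5.612847 on 5 df"
df_model <- 5          # degrees of freedom of that test

# Assumption: criterion = 2 * df - LR statistic (AIC relative to the null model)
aic_rel_null <- 2 * df_model - lr_chisq

print(c(edf = df_model, AIC = aic_rel_null))
```

This arithmetic reproduces the pair 5 and 4.387153 shown by `extractAIC(full1F)`. If that reading is right, the value would only be meaningful for comparing models fitted to the same data, not as an absolute measure of fit.
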
kjetil b halvorsen
user80121
  • A little hard to say without more information (can you provide a reproducible example?), but you might want to look at profile confidence intervals (`pl=TRUE` in your `logistf(...)` call) rather than Wald intervals ... the fact that `logistf` provides an `extractAIC` method suggests that it would be OK to report the AIC ... – Ben Bolker Jun 18 '15 at 22:35
  • You might also consider the `brglm` implementation which has the output formatted in the same way as `glm`. It can also be interpreted in the same way (including information criteria) as the estimator is rather close to the maximum likelihood estimator - just adding some bias reduction. – Achim Zeileis Jun 19 '15 at 00:45
  • I added more details of the output I get. I generated confidence intervals by Profile Likelihood. The chisq for the dichotomous variable is infinity. I am thinking that is the reason for the wide CI. Is there any way to get around that? I could also use help with interpreting the p-values of 0 or 1 and the two numbers generated for the AIC (5.0 and 4.39). – user80121 Jun 19 '15 at 05:23
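
On the p-values of exactly 0 and 1: a Chisq of 0 or Inf sitting next to a finite profile interval may indicate that the profile likelihood test did not evaluate properly for those terms, rather than reflecting the data. One rough cross-check, assuming the printed `se(coef)` column can be trusted, is to compute Wald statistics by hand from the printed coefficients. The object names below are illustrative, and Wald tests can be unreliable with rare events, so this is a sanity check and not a replacement for the profile likelihood results.

```r
# Coefficients and standard errors copied from the summary() output above
est <- c("(Intercept)" = 4.12807056, "log(v1)" = -0.08109114,
         "sqrt(v2)" = 0.39223967, "log(v3)" = 0.31123164,
         "log(v4)" = -0.53354748, "dich_varYes" = 4.14502663)
se <- c(2.7584264, 0.1625829, 0.1831892, 0.3336092, 0.3718985, 3.9598211)

# Wald z statistics and two-sided p-values under a normal approximation
z <- est / se
p_wald <- 2 * pnorm(-abs(z))

# Wald 95% intervals on the log-odds scale, for comparison with the profile limits
crit <- qnorm(0.975)
wald_ci <- cbind(lower = est - crit * se, upper = est + crit * se)

print(round(cbind(z = z, p = p_wald, wald_ci), 4))
```

None of these hand-computed p-values is exactly 0 or 1. Where the Wald and profile results disagree strongly, refitting and inspecting the convergence messages would be a reasonable next step.
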
