Why do the p-value and confidence intervals not always match in logistf?

Question

I have recently started using logistf in R and find that I am getting strange results. Namely, sometimes the confidence intervals for the coefficient cross 0, despite having a strongly significant p-value, and vice versa.

For example:

library(logistf)

outcome <- as.factor(c(0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1))
predictor <- c(0,5,5,5,5,5,5,5,5,5,5,5,1,0,0,0,0)
logistf(outcome ~ predictor)

In my output, I discover the coefficient for my predictor to be -0.4 (95% confidence interval -2 to 7). However, the p-value is 0.00000352.

Can anyone explain how I am using this penalised logistic regression method incorrectly as this result seems to be inconsistent?

Something fishy is that using `glm(..., family = binomial)` and `logistf(..., firth = FALSE)` give quite radically different coefficient estimates even though they're both using maximum likelihood to estimate the coefficients. Maybe this package has a bug in it. You can perform the same analysis using the `brglm2` package, which has results more in line with what you get using `glm()`. — Noah, Apr 06 '20 at 06:56
You can set pl=FALSE so that you get Wald confidence intervals. I think the profile penalized likelihood estimation doesn't quite work in this case. — StupidWolf, Apr 06 '20 at 08:04
I don't understand the meaning of your "however:" the p-value and confidence interval are perfectly consistent. Maybe you should review our posts on what p-values mean: start with https://stats.stackexchange.com/questions/31. — whuber, Apr 06 '20 at 12:49
Thank you all. @StupidWolf if you input your response as an answer I'll accept it. — Tim K, Apr 07 '20 at 07:06

score 1 · Accepted Answer · answered Apr 07 '20 at 09:19

By default, logistf computes its confidence intervals and p-values from the profile penalized log likelihood. Most likely you are looking for Wald confidence intervals and p-values instead, which you get by setting pl = FALSE:

library(logistf)

outcome <- as.factor(c(0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1))
predictor <- c(0,5,5,5,5,5,5,5,5,5,5,5,1,0,0,0,0)

# pl = FALSE switches from profile penalized likelihood to Wald intervals and tests
logistf(outcome ~ predictor, pl = FALSE)

logistf(formula = outcome ~ predictor, pl = FALSE)
Model fitted by Penalized ML
Confidence intervals and p-values by Wald 

                  coef  se(coef) lower 0.95 upper 0.95     Chisq         p
(Intercept)  3.9062500 3.1794521  -2.325362 10.1378616 1.5094387 0.2192248
predictor   -0.3853818 0.6646985  -1.688167  0.9174034 0.3361499 0.5620600

Likelihood ratio test=21.50504 on 1 df, p=3.528995e-06, n=17
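
As a rough check on how the Wald columns fit together, the sketch below recomputes the predictor's interval, chi-squared statistic, and p-value from the coef and se(coef) values printed above, using base R only. The variable names are illustrative, and the numbers are copied by hand from the output rather than extracted from a fitted object. It also recomputes the p-value of the overall likelihood ratio test from the last line of the output; that value (about 3.5e-06) appears to be very close to the 0.00000352 quoted in the question, which suggests, though the question does not confirm it, that the quoted p-value may belong to the whole-model test rather than to the predictor's row.

```r
# Values copied by hand from the predictor row of the Wald output above
coef_hat <- -0.3853818
se_hat   <- 0.6646985

# Wald 95% interval: estimate +/- z * standard error
z <- qnorm(0.975)
wald_ci <- c(lower = coef_hat - z * se_hat,
             upper = coef_hat + z * se_hat)

# Wald chi-squared statistic and its p-value on 1 degree of freedom
wald_chisq <- (coef_hat / se_hat)^2
wald_p <- pchisq(wald_chisq, df = 1, lower.tail = FALSE)

# Overall likelihood ratio test statistic from the last line of the output
lr_p <- pchisq(21.50504, df = 1, lower.tail = FALSE)

print(round(wald_ci, 4))
print(c(chisq = wald_chisq, p = wald_p, lr_p = lr_p))
```

With Wald inference, the interval and the p-value in the predictor's row are built from the same estimate and standard error, so they necessarily agree on whether 0 is excluded at the 5% level.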
