
I am fitting a logistic regression to estimate the effect of dose, age, PS, menopausal status and PairID on the response variable. The data come from a case-control study in which controls were matched to cases on age, PS and menopausal status. The pair identifier is recorded as PairID and is also included as a covariate in the model.

The model is:

    mod2 <- glm(response ~ dose + Age + PS + menopausal + PairID, 
                family = binomial(link = "logit"), data=dat)

The model output shows the estimated coefficients:

    summary(mod2)

    Deviance Residuals: 
     Min        1Q    Median        3Q       Max  
    -0.89847  -0.36858  -0.05885  -0.02615   3.00978  
    
    Coefficients: (2 not defined because of singularities)
                     Estimate Std. Error z value Pr(>|z|)
    (Intercept)     -32.73273   11.42767  -2.864 0.004179 ** 
    MaxDose           0.07984    0.02287   3.491 0.000482 ***
    Age               0.62496    0.27268   2.292 0.021910 *  
    PS1              -0.39054    0.89232  -0.438 0.661628    
    PS2              17.09581 1268.45566   0.013 0.989247    
    PS3              15.76341 1833.21516   0.009 0.993139    
    menopausalpost  -12.55850    5.28287  -2.377 0.017444 *  
    menopausalmale  -23.87464   10.51685  -2.270 0.023200 *  
    PairID2           5.14332    3.02191   1.702 0.088754 .  
    PairID3           3.60030    2.20299   1.634 0.102200    
    PairID4         -13.42706    6.47253  -2.074 0.038036 *  
    PairID5          -0.16041    1.50785  -0.106 0.915276    
    PairID6         -13.20065 1268.45572  -0.010 0.991697    
    PairID7         -19.62178 1833.21665  -0.011 0.991460    
    PairID8          -7.08125    3.55520  -1.992 0.046393 *  
    PairID9          -0.05779    1.84367  -0.031 0.974994    
    PairID10         -6.59761    3.44665  -1.914 0.055593 .  
    PairID11          3.13177    1.55304   2.017 0.043744 *  
    PairID12          9.04193    4.16930   2.169 0.030106 *  
    PairID13         15.00443    7.03528   2.133 0.032946 *  
    PairID14          1.00987    1.58307   0.638 0.523527    
    PairID15          5.45367    2.91475   1.871 0.061336 .  
    PairID16         -3.13399    2.18068  -1.437 0.150672    
    PairID17          6.85244    3.63025   1.888 0.059081 .  
    PairID18          5.60300    3.02755   1.851 0.064217 .  
    PairID19               NA         NA      NA       NA    
    PairID20        -11.53556    5.23541  -2.203 0.027569 *  
    PairID21        -11.09677    5.23554  -2.120 0.034048 *  
    PairID22          6.15099    3.16279   1.945 0.051799 .  
    PairID23               NA         NA      NA       NA    
    PairID24        -18.62866 1268.45749  -0.015 0.988283    
    PairID25          4.83989    1.81209   2.671 0.007565 ** 
    ---
    Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 
    
    (Dispersion parameter for binomial family taken to be 1)
    
        Null deviance: 226.13  on 540  degrees of freedom
    Residual deviance: 167.62  on 511  degrees of freedom
    AIC: 227.62
    
    Number of Fisher Scoring iterations: 17

From this output, it seems that the covariate menopausal has a significant effect. However, if I use an analysis of deviance to test the significance of each covariate, menopausal shows Df = 0. Does anyone know why?

    drop1(mod2, test = "Chisq")

    Single term deletions
    
    Model:
    collapsed ~ MaxDose + Age + PS + menopausal + PairID
               Df Deviance    AIC    LRT  Pr(>Chi)    
    <none>          167.62 227.62                     
    MaxDose     1   213.08 271.08 45.462 1.557e-11 ***
    Age         1   175.22 233.22  7.602  0.005829 ** 
    PS          3   173.49 227.49  5.868  0.118227    
    menopausal  0   167.62 227.62  0.000              
    PairID     22   186.31 202.31 18.684  0.664773 
kjetil b halvorsen
tiantianchen
  • I don't know, but you have some very large coefficients there with enormous standard errors. [Separation](http://stats.stackexchange.com/questions/11109/how-to-deal-with-perfect-separation-in-logistic-regression/68917)? – Scortchi - Reinstate Monica Nov 29 '13 at 21:15
  • I should mention that the data come from a case-control study in which controls were matched/paired on age, PS and menopausal. The pair ID is recorded as PairID and is also included in the model. Could the relation between PairID and the remaining covariates be what leads to the strange output? – tiantianchen Nov 29 '13 at 21:25
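
One way to probe the relation raised in the last comment is to check whether menopausal is constant within each pair, and whether dropping it changes the rank of the design matrix at all. The sketch below uses a made-up data frame `toy` (6 pairs of 4 subjects, not the poster's data) in which menopausal status is identical within every pair, as the matching would suggest. The names `toy`, `X_full`, `X_drop` and the `dose` values are assumptions for illustration only. If the same pattern holds in the real `dat`, it would be one plausible explanation for the `Df = 0` row and the `NA` coefficients, though this has not been verified on the original data.

```r
# Hypothetical toy data (not the poster's): 6 matched pairs of 4 subjects,
# with menopausal status identical within each pair, as matching would imply.
toy <- data.frame(
  PairID     = factor(rep(1:6, each = 4)),
  menopausal = factor(rep(c("pre", "post", "male"), each = 8),
                      levels = c("pre", "post", "male")),
  dose       = seq_len(24)   # stand-in covariate that varies within pairs
)

# 1. Is menopausal constant within each pair?
#    Exactly one non-zero cell per row of the cross-table means yes.
tab <- with(toy, table(PairID, menopausal))
constant_within_pair <- all(rowSums(tab > 0) == 1)

# 2. Compare the rank of the design matrix with and without menopausal.
#    Equal ranks mean that removing the term frees no degrees of freedom.
X_full <- model.matrix(~ dose + menopausal + PairID, data = toy)
X_drop <- model.matrix(~ dose + PairID, data = toy)
rank_full <- qr(X_full)$rank
rank_drop <- qr(X_drop)$rank

# Columns of the full design that cannot be estimated separately
n_aliased <- ncol(X_full) - rank_full

print(tab)
cat("constant within pair:", constant_within_pair, "\n")
cat("rank with menopausal:", rank_full, " without:", rank_drop, "\n")
cat("aliased columns:", n_aliased, "\n")
```

In this toy case the two ranks coincide, because each menopausal dummy is a sum of PairID dummies. Running the same two checks on `dat` (with `MaxDose`, `Age` and `PS` added to the formulas) would show whether that is what happens in the fitted model.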
