
I am running a conditional logistic regression in R using clogit(). I have 314 strata, each with 1 case and 1 control (628 observations in total). Several predictor variables have missing values, so 6 observations are excluded from the analysis, leaving 622 observations with 310 events. Two strata now contain 0 cases and 1 control. I expected such strata to be omitted from the analysis, but they are not: 622 residuals are reported. How does clogit() handle strata in which one member of the pair has a missing value?

kjetil b halvorsen
Olga Dem
  • "strata" is already plural. (one *stratum*, two *strata*). I have made some small edits. Please check it still conveys your intention. – Glen_b Mar 18 '17 at 07:16

2 Answers


The residual will be zero for the remaining observation in that stratum. There's no need to remove it: if the stratum had only two observations to begin with, the one that remains provides no information.

> library(survival)
> data(retinopathy)
> head(retinopathy)
  id laser   eye age     type trt futime status risk
1  5 argon  left  28    adult   1  46.23      0    9
2  5 argon  left  28    adult   0  46.23      0    9
3 14 argon right  12 juvenile   1  42.50      0    8
4 14 argon right  12 juvenile   0  31.30      1    6
5 16 xenon right   9 juvenile   1  42.27      0   11
6 16 xenon right   9 juvenile   0  42.27      0   11
> 
> allmodel<- clogit(status~trt+strata(id),data=retinopathy)
> allmodel
Call:
clogit(status ~ trt + strata(id), data = retinopathy)

      coef exp(coef) se(coef)      z       p
trt -1.371     0.254    0.280 -4.896 9.8e-07

Likelihood ratio test=29.9  on 1 df, p=4.544e-08
n= 394, number of events= 155 
> resid(allmodel)[1:4]
         1          2          3          4 
 0.0000000  0.0000000 -0.2025316  0.2025316 
> 
> retinopathy$trt[3]<-NA
> missmodel<- clogit(status~trt+strata(id),data=retinopathy)
> missmodel
Call:
clogit(status ~ trt + strata(id), data = retinopathy)

       coef exp(coef) se(coef)      z        p
trt -1.3545    0.2581   0.2804 -4.831 1.36e-06

Likelihood ratio test=28.97  on 1 df, p=7.344e-08
n= 393, number of events= 155 
   (1 observation deleted due to missingness)
> resid(missmodel)[1:3]
1 2 4 
0 0 0

[If there had been more than two observations in the stratum to begin with, the stratum would still be informative, of course. The residuals would then be what you'd expect for the remaining data in the stratum]
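
As a sketch of why the leftover observation is harmless, assume the usual conditional likelihood for 1:1 matched data. The notation is introduced here for illustration only: $x_{s1}$ and $x_{s0}$ are the covariate vectors of the case and the control in stratum $s$. A complete pair contributes

$$
L_s(\beta) = \frac{\exp(x_{s1}^\top \beta)}{\exp(x_{s1}^\top \beta) + \exp(x_{s0}^\top \beta)} .
$$

If only the case remains, the denominator sums over a single observation, so

$$
L_s(\beta) = \frac{\exp(x_{s1}^\top \beta)}{\exp(x_{s1}^\top \beta)} = 1 ,
$$

and if only the control remains, the stratum has no event and contributes no factor at all. In both situations $\log L_s(\beta) = 0$ and $\partial \log L_s(\beta) / \partial \beta = 0$ for every $\beta$, so the stratum changes neither the estimate nor its standard error, which is consistent with the zero residuals shown above.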

Thomas Lumley

In principle, strata with missing values on the case and/or control observation should be removed from the analysis. I haven't used the clogit command recently, but I am pretty sure this is done automatically, or at least that an error or warning message is displayed during estimation. Otherwise, you could remove incomplete strata manually. (As a general rule of thumb, I think it is better to do data cleaning yourself than to rely on estimation commands.)
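
A minimal sketch of removing incomplete strata by hand, assuming the retinopathy data from the survival package used in the other answer. The object names (d, d_cc, n_in_stratum, d_pairs, fit_keep, fit_drop) are illustrative, and the single NA is introduced artificially. The sketch also fits the model without the manual clean-up, so the two sets of estimates can be compared.

```r
library(survival)

# Copy the example data and blank out one covariate value, so that
# the stratum id == 14 is left with a single usable observation.
d <- survival::retinopathy
d$trt[3] <- NA

# Fit 1: rely on clogit()'s default NA handling, which drops only the
# incomplete row and keeps the rest of its stratum.
fit_keep <- clogit(status ~ trt + strata(id), data = d)

# Fit 2: drop incomplete rows by hand, then drop every stratum that is
# left with fewer than two observations.
d_cc <- d[complete.cases(d[, c("status", "trt", "id")]), ]
n_in_stratum <- ave(rep(1, nrow(d_cc)), d_cc$id, FUN = sum)
d_pairs <- d_cc[n_in_stratum >= 2, ]
fit_drop <- clogit(status ~ trt + strata(id), data = d_pairs)

# Compare the two estimates; if the singleton stratum carries no
# information, they should agree up to numerical tolerance.
all.equal(coef(fit_keep), coef(fit_drop), tolerance = 1e-6)
```

Cleaning the data yourself makes the analysed sample explicit, which helps when reporting how many matched pairs were actually used.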

Nicolas K