I would like to set up a series of tests for a difference in survival between two groups of very unequal size.
Generally, either a log-rank test (using the R survdiff function) or a Cox regression (R coxph) with stratified patient variables works well. However, in some cases one group is small and the event has a relatively low incidence, which makes the expected number of events in that group very small. In those circumstances it does not seem sensible to use the p-value generated by the log-rank test, since it relies on a chi-squared approximation, which is unreliable for small numbers of expected events (surprisingly, R does not give a warning message for this). An admittedly fairly extreme example illustrates the problem:
survdiff(formula = survobject ~ (Fixation == i), data = TKRGroup)
n=637763, 424 observations deleted due to missingness.
                         N Observed Expected (O-E)^2/E (O-E)^2/V
Fixation == i=FALSE 637725    11174 1.12e+04  5.52e-04      11.9
Fixation == i=TRUE      38        3 5.17e-01  1.19e+01      11.9
Chisq= 11.9 on 1 degrees of freedom, p= 0.000555
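The (O-E)^2/E column can be reproduced by hand, which makes it plain that almost all of the statistic comes from the 38-patient group. This is only a sketch of the arithmetic: the variable names are mine, and the expected count for the large group is not printed to full precision above, so it is derived here on the assumption that the expected events sum to the observed total (as they do in a log-rank test).

```r
# Observed events per group, taken from the survdiff output above
obs <- c(large = 11174, small = 3)

# Expected events in the small group, as printed (5.17e-01)
exp_small <- 0.517

# Assumption: expected counts sum to the total observed events,
# so the large-group expectation is whatever is left over
expd <- c(large = sum(obs) - exp_small, small = exp_small)

# Per-group contribution (O - E)^2 / E
contrib <- (obs - expd)^2 / expd
print(signif(contrib, 3))
```

The small group's term is driven by a denominator of about half an event, which is exactly the situation in which the chi-squared approximation is least trustworthy.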
Cox regression gives a higher p-value of 0.0023, though that still looks rather low for these numbers of observed and expected events.
coxph(formula = survobject ~ (Fixation == i), data = TKRGroup)
                  coef exp(coef) se(coef)    z      p
Fixation == iTRUE 1.76       5.8    0.577 3.05 0.0023
Further summary information gives:
Likelihood ratio test= 5.58 on 1 df, p=0.01813
Wald test = 9.27 on 1 df, p=0.002325
Score (logrank) test = 11.92 on 1 df, p=0.0005543
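For reference, all three p-values are upper-tail probabilities of a chi-squared distribution with 1 degree of freedom, so they can be recomputed from the printed statistics alone. A minimal sketch (the variable names are mine, and the statistics are the rounded values shown above, so the results agree only to about three significant figures):

```r
# Test statistics as printed in the coxph summary (rounded)
stats <- c(likelihood_ratio = 5.58, wald = 9.27, score = 11.92)

# Upper-tail chi-squared probabilities on 1 degree of freedom
pvals <- pchisq(stats, df = 1, lower.tail = FALSE)
print(signif(pvals, 3))

# The Wald z in the coefficient table is coef / se(coef)
z_wald <- 1.76 / 0.577
p_wald_from_z <- 2 * pnorm(-abs(z_wald))
print(signif(p_wald_from_z, 2))
```

All three rest on the same large-sample approximation; they differ here by a factor of more than 30, which is itself a sign that the approximation is under strain.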
At this point, I could do with some expert advice on which, if any, of these p-values to use, or whether there is some alternative approach available (preferably within an R package!). Given the size of the groups, I rather naively attempted to get some idea of a sensible p-value by applying a Poisson exact test to the observed and expected figures: observed / expected values of 3 / 0.517 give a cumulative Poisson probability P(X ≥ 3) = 0.0157. That seems a much more reasonable figure, though I am not sure I could defend it.
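For anyone wanting to check the Poisson figure, this is roughly how it can be computed in base R. It is a sketch only: it treats the expected count of 0.517 as a fixed, known Poisson mean, which ignores the fact that it was itself estimated from the data, and the variable names are mine.

```r
observed <- 3      # events seen in the small group
expected <- 0.517  # expected events, from the survdiff output

# P(X >= 3) = 1 - P(X <= 2) for X ~ Poisson(expected)
p_tail <- ppois(observed - 1, lambda = expected, lower.tail = FALSE)
print(signif(p_tail, 3))

# The same one-sided test via stats::poisson.test, taking the
# expected count as the "time base" T and testing rate r = 1
pt <- poisson.test(observed, T = expected, r = 1, alternative = "greater")
print(signif(pt$p.value, 3))
```

Both calls give the same one-sided p-value, matching the 0.0157 quoted above.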