In the output from a Cox regression I get several p values for a single variate: one for each level of the variate except one. I want to know the effect of the variate as a whole on the outcome (adjusted for the other variates), so a single p value would suffice. (I can't seem to use anova because I'm using the R survival function survSplit, although that would be ideal.) I am therefore stuck with trying to interpret the p values for the individual levels of the variate.
My question is: what exactly do they mean, and how do I know when the variate has a significant effect on the outcome? Do the p values for all levels of the variate have to be significant to conclude that the variate has a significant effect?
The p values seem to be calculated with reference to one of the levels of the variate (presumably treated as a reference level), but is that useful? As an example, the outcome could be death, with genetic-time groups given by survSplit, and the variate of interest could be an ordinal variable, e.g. the nodal status of a patient.
In regression I normally think of whole variates, not individual levels, as having a regression coefficient. Does that mean that here each level of a variate has its own regression coefficient, which is independent of all levels apart from the reference level? And if it is significant, can it be interpreted as having an association with the outcome/time (= hazard) relative to the reference level's association with the outcome/time? (Note that this doesn't seem very useful if I don't know what the reference level's association with outcome/time actually is!)
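To make the question concrete, this is how I understand the model to be parameterised. It assumes the default treatment (dummy) coding, where x_a, x_b, x_c are 0/1 indicators for the non-reference levels of nodal status and the trailing dots stand for the other terms in the model; the symbols are my own labels, not something taken from the output.

```latex
h(t \mid x) = h_0(t)\,\exp\!\left(\beta_a x_a + \beta_b x_b + \beta_c x_c + \dots\right),
\qquad
\frac{h(t \mid \text{level } a)}{h(t \mid \text{reference level})} = e^{\beta_a}
\quad \text{(other terms held fixed)}
```

If that is right, each per-level p value tests the null hypothesis that one coefficient is zero, i.e. that the hazard for that level equals the hazard for the reference level, and the reference level itself has no coefficient because it is absorbed into the baseline hazard h_0(t).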
Unfortunately anova.rms and base anova will not work directly (at least with my data) on a single coxph object fitted to survSplit data, presumably because the groupsplit:strata(time_group) term, which has two levels, always yields a row of NAs with an se(coef) of zero for one of these levels (in each time group), as in the following results from the model
    coxph(Surv(tstart, time, BCSSsplit) ~ groupsplit:strata(time_group) + Nodal_status_CATEGOR_CATEGOR, data = BCSSsplitdata)

                                                     se(coef)       z         p
    Nodal_status_CATEGOR_CATEGORa                   2.959e-01   2.071    0.0384
    Nodal_status_CATEGOR_CATEGORb                   2.698e-01   5.643  1.67e-08
    Nodal_status_CATEGOR_CATEGORc                   2.964e+03  -0.005    0.9959
    groupsplitHer2+:strata(time_group)time_group=1  2.774e-01  -4.112  3.92e-05
    groupsplitTNBC:strata(time_group)time_group=1   0.000e+00      NA        NA
    groupsplitHer2+:strata(time_group)time_group=2  3.930e-01   2.563    0.0104
    groupsplitTNBC:strata(time_group)time_group=2   0.000e+00      NA        NA
The error given is: Error in `.rowNamesDF<-`(x, value = value) : invalid 'row.names' length
anova will, however, compare two such objects, so I'll have to settle for that unless someone knows a way to make anova handle a single coxph object like this. (The time split, or some other adjustment, is necessary given the results of Schoenfeld testing for proportional hazards.)
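For concreteness, the two-model comparison I mean would look roughly like the sketch below. It uses the built-in veteran data from the survival package as a stand-in, with celltype (four levels) playing the role of nodal status and karno as an arbitrary adjustment variable; none of this is my actual data or model. The hand-computed joint Wald test at the end is an alternative I am assuming would also give one p value for the whole variate.

```r
library(survival)

# Stand-in data: `veteran` ships with the survival package.
# `celltype` (4 levels) plays the role of the multi-level variate.
full    <- coxph(Surv(time, status) ~ celltype + karno, data = veteran)
reduced <- coxph(Surv(time, status) ~ karno, data = veteran)

# Likelihood-ratio test between the nested models:
# one p value for the variate as a whole.
lrt <- anova(reduced, full)
print(lrt)

# Joint Wald test on the celltype coefficients only, computed by hand
# from the coefficient vector and its covariance matrix.
idx    <- grep("^celltype", names(coef(full)))
b      <- coef(full)[idx]
V      <- vcov(full)[idx, idx]
wald   <- drop(t(b) %*% solve(V, b))
p_wald <- pchisq(wald, df = length(b), lower.tail = FALSE)
print(p_wald)
```

Because the Wald calculation only uses the rows of coef() and vcov() that belong to the variate of interest, I would expect it not to be affected by NA rows from other terms, but I have not checked this on my own survSplit model.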