Kaplan-Meier Subsetting/Selecting Strata for Comparison

Question

When comparing different strata, how legitimate is it to merge or eliminate a single stratum, especially one that represents an intermediate level, in order to provide a comparison? Each stratum corresponds to one of three locations along a bone (distal, middle, proximal), and the outcome is its proclivity for reinjury after repair. For example, when comparing three strata I got

Eliminating the middle stratum.

Thank you in advance!

It's hard to answer this question as posed, as there's no information on what the strata represent. If they are binned groupings of a continuous variable you probably [shouldn't even analyze your data this way](https://stats.stackexchange.com/q/68834/28500), reserving Kaplan-Meier plots for display of illustrative situations. Please edit your question to include more information about the nature of your strata. Comments are easy to overlook and can get deleted. — EdM, Oct 14 '21 at 14:21
@EdM Each stratum relates to one of three locations along a bone (distal, middle, proximal) and its proclivity for reinjury after repair. I've added that to the question above! — Cenoc, Oct 14 '21 at 14:26

score 0 · Answer 1 · answered Oct 15 '21 at 16:50

Changing your analysis in the way that you propose would be considered inappropriate in terms of any claim of "statistical significance" for the final nominal p-value. You altered your analysis based on results that you found in an initial analysis, so the assumptions that go into defining a p-value don't hold for the second analysis. You certainly can explore your data that way to help design future studies, but you shouldn't report a "statistically significant" result based on your second Kaplan-Meier plot.
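To make the point about the nominal p-value concrete, here is a minimal simulation sketch in Python (standard library only). It does not use the data from the question: it assumes three groups with identical exponential event times, no censoring, and 30 subjects per group, and the names `logrank_p` and `simulate` are invented for this illustration. It contrasts a comparison fixed in advance with the smallest of the three pairwise log-rank p-values, a simplified stand-in for choosing which strata to compare after looking at the curves.

```python
import math
import random


def logrank_p(times_a, times_b):
    """Two-sided log-rank p-value for two groups of fully observed event times."""
    # Pool the observations, tagging each with its group (0 or 1).
    pooled = sorted([(t, 0) for t in times_a] + [(t, 1) for t in times_b])
    at_risk = [len(times_a), len(times_b)]
    obs_minus_exp = 0.0
    variance = 0.0
    i = 0
    while i < len(pooled):
        t = pooled[i][0]
        deaths = [0, 0]
        # Collect all events that share this time.
        while i < len(pooled) and pooled[i][0] == t:
            deaths[pooled[i][1]] += 1
            i += 1
        n = at_risk[0] + at_risk[1]
        d = deaths[0] + deaths[1]
        # Observed minus expected events in group 0 (hypergeometric mean).
        obs_minus_exp += deaths[0] - d * at_risk[0] / n
        if n > 1:
            variance += d * (at_risk[0] / n) * (at_risk[1] / n) * (n - d) / (n - 1)
        at_risk[0] -= deaths[0]
        at_risk[1] -= deaths[1]
    z = obs_minus_exp / math.sqrt(variance)
    # Two-sided normal tail, equivalent to a 1-df chi-square test on z**2.
    return math.erfc(abs(z) / math.sqrt(2))


def simulate(n_sims=1000, n_per_group=30, alpha=0.05, seed=1):
    """Return (rejection rate for a fixed comparison, rate for the best-looking one)."""
    rng = random.Random(seed)
    fixed_hits = 0
    picked_hits = 0
    for _ in range(n_sims):
        # Null scenario (an assumption): all three locations share one hazard.
        groups = [[rng.expovariate(1.0) for _ in range(n_per_group)] for _ in range(3)]
        distal, middle, proximal = groups
        p_fixed = logrank_p(distal, proximal)  # chosen before seeing any data
        p_picked = min(
            p_fixed, logrank_p(distal, middle), logrank_p(middle, proximal)
        )  # chosen after seeing which pair looks most different
        fixed_hits += p_fixed < alpha
        picked_hits += p_picked < alpha
    return fixed_hits / n_sims, picked_hits / n_sims


if __name__ == "__main__":
    fixed_rate, picked_rate = simulate()
    print(f"pre-specified comparison: {fixed_rate:.3f}")
    print(f"comparison picked after looking: {picked_rate:.3f}")
```

Because the smallest of the three p-values can never exceed the pre-specified one, the second rejection rate is always at least as large as the first. Under this null scenario, any excess over the nominal level comes purely from choosing the comparison after the fact.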

In your case, the initial analysis might have been inadequate. It did not take the ordering of your location predictor in terms of "proclivity for reinjury" into account. The log-rank test used to evaluate differences among strata didn't incorporate the information that you expect an ordering in survival curves among the strata.
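One way to bring the ordering into a log-rank style comparison is a test for trend, which weights each stratum's observed-minus-expected event count by a score. The sketch below is a plain-Python illustration under stated assumptions, not the analysis from the question: the function name `logrank_trend`, the equally spaced scores 1, 2, 3, and the toy data are all hypothetical.

```python
import math


def logrank_trend(groups, scores):
    """Log-rank test for trend across ordered strata.

    groups: one list per stratum of (time, event) pairs, where event is 1 for
            an observed reinjury and 0 for a censored observation.
    scores: one numeric score per stratum, encoding the assumed ordering.
    Returns (z, two_sided_p).
    """
    pooled = sorted(
        (time, event, g) for g, obs in enumerate(groups) for time, event in obs
    )
    at_risk = [len(obs) for obs in groups]
    u = 0.0
    var = 0.0
    i = 0
    while i < len(pooled):
        t = pooled[i][0]
        deaths = [0] * len(groups)
        leaving = [0] * len(groups)
        # Subjects censored at time t still count as at risk at t.
        while i < len(pooled) and pooled[i][0] == t:
            _, event, g = pooled[i]
            deaths[g] += event
            leaving[g] += 1
            i += 1
        n = sum(at_risk)
        d = sum(deaths)
        if d > 0:
            mean_score = sum(w * nj for w, nj in zip(scores, at_risk)) / n
            mean_sq = sum(w * w * nj for w, nj in zip(scores, at_risk)) / n
            # Score-weighted observed minus expected events at this time.
            u += sum(w * dj for w, dj in zip(scores, deaths)) - d * mean_score
            if n > 1:
                var += d * (n - d) / (n - 1) * (mean_sq - mean_score ** 2)
        for g in range(len(groups)):
            at_risk[g] -= leaving[g]
    z = u / math.sqrt(var)
    return z, math.erfc(abs(z) / math.sqrt(2))


if __name__ == "__main__":
    # Hypothetical toy data: (time to reinjury or censoring, event indicator).
    distal = [(2, 1), (3, 1), (5, 1), (7, 0)]
    middle = [(4, 1), (6, 1), (8, 0), (9, 1)]
    proximal = [(6, 1), (10, 0), (11, 1), (12, 0)]
    z, p = logrank_trend([distal, middle, proximal], scores=[1, 2, 3])
    print(f"z = {z:.3f}, two-sided p = {p:.3f}")
```

A negative `z` here means that higher-scored strata had fewer events than expected under no difference. The choice of scores is itself a modelling decision and, like the choice of strata, should be fixed before looking at the results.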

You might consider coding location as an ordinal categorical predictor and performing Cox regression instead. In R, ordered categorical predictors (ordered factors) are coded differently from unordered predictors, by default with polynomial contrasts rather than treatment contrasts, in a way that takes the order into account. That linked thread is a good place to start. If you do further web searches for "ordinal", be careful: much of what you find might deal with ordinal outcome variables rather than ordinal predictor variables.
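To show what "coded differently" means for an ordered predictor, the sketch below builds orthonormal polynomial contrasts for equally spaced levels in plain Python. For three levels it gives the same linear and quadratic columns as R's `contr.poly(3)`. The function names and the ordering distal < middle < proximal are assumptions for this illustration; the ordering should come from the subject matter, not from the observed curves.

```python
import math


def polynomial_contrasts(n_levels):
    """Orthonormal polynomial contrast columns for equally spaced ordered levels."""
    # Centred, equally spaced scores, e.g. (-1, 0, 1) for three levels.
    scores = [i - (n_levels - 1) / 2 for i in range(n_levels)]
    basis = [[1.0] * n_levels]  # constant term, used only for orthogonalisation
    for degree in range(1, n_levels):
        col = [s ** degree for s in scores]
        # Gram-Schmidt: remove what the lower-degree columns already explain.
        for prev in basis:
            coef = sum(c * p for c, p in zip(col, prev)) / sum(p * p for p in prev)
            col = [c - coef * p for c, p in zip(col, prev)]
        norm = math.sqrt(sum(c * c for c in col))
        basis.append([c / norm for c in col])
    return basis[1:]  # drop the constant column


def encode_location(location, order=("distal", "middle", "proximal")):
    """Return the (linear, quadratic) covariate values for one location."""
    idx = order.index(location)
    return tuple(col[idx] for col in polynomial_contrasts(len(order)))


if __name__ == "__main__":
    linear, quadratic = polynomial_contrasts(3)
    # For 3 levels: linear ≈ (-0.707, 0, 0.707), quadratic ≈ (0.408, -0.816, 0.408)
    print([round(v, 3) for v in linear])
    print([round(v, 3) for v in quadratic])
    print(encode_location("middle"))
```

In a regression with this coding, the coefficient on the linear column captures a monotone trend across locations, while the quadratic column captures how far the middle location departs from that trend.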
