I am looking into how differing brain tumor genetics affect patient survival. I have a gene dataset with around 4600 predictors. For each of these predictors, I fitted a Cox proportional hazards model using the implementation in the R survival package, to see how they interact with patient metadata such as age, sex, and others.

I identified several genes this way that are good predictors of overall survival, but I have a feeling that I should correct for this repeated testing. Which methods should I use? Bonferroni seems very conservative for my use case.

florian
  • Does this answer your question? [Correcting p values for multiple tests where tests are correlated (genetics)](https://stats.stackexchange.com/questions/2819/correcting-p-values-for-multiple-tests-where-tests-are-correlated-genetics). That provides extensive discussion. Control for false-discovery rate, described in answers on that page, is frequently the choice in this type of situation. – EdM Oct 12 '21 at 13:12
  • Thank you for linking this thread, it is indeed very informative. So it seems that the standard p-value corrections used with other methods, such as Bonferroni, can also be applied to the outputs of Cox models? – florian Oct 13 '21 at 09:16
  • A list of p-values is a list of p-values, whether from linear regression or Cox models. Read about the [multiple comparisons problem](https://en.wikipedia.org/wiki/Multiple_comparisons_problem). There are interesting issues arising from lack of independence among the p-values, but in practice those issues seem to be overlooked. – EdM Oct 13 '21 at 12:43
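
To make the difference between the two corrections discussed in the comments concrete, here is a minimal sketch of Bonferroni next to the Benjamini-Hochberg procedure, one common way to control the false-discovery rate. It is an illustration only, not the asker's analysis: the gene names and p-values are made up, and the function names are hypothetical. In R, `p.adjust(p, method = "BH")` and `p.adjust(p, method = "bonferroni")` perform the same adjustments on a vector of per-gene p-values.

```python
def bonferroni(p_values):
    """Bonferroni-adjusted p-values: multiply by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, so each adjusted value is the
    # minimum of p * m / rank over this rank and all larger ranks.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical per-gene p-values, e.g. one per Cox model (values are made up).
gene_names = ["GENE_A", "GENE_B", "GENE_C", "GENE_D", "GENE_E"]
raw_p = [0.0001, 0.004, 0.03, 0.2, 0.8]

bonf_p = bonferroni(raw_p)
bh_p = benjamini_hochberg(raw_p)

for name, p, b, q in zip(gene_names, raw_p, bonf_p, bh_p):
    print(f"{name}: raw={p:.4g}  bonferroni={b:.4g}  BH={q:.4g}")
```

With many tests, the Benjamini-Hochberg values are never larger than the Bonferroni ones, which is why false-discovery-rate control tends to feel less conservative in screens of thousands of genes. Note that this procedure is usually justified under independence or positive dependence among the tests, which ties in with the caveat about correlated p-values above.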
