
I am doing some statistical analysis using Kaplan-Meier (KM) curves and Cox analysis on a list of genes, combining each of them with a single marker.

On the KM curves these genes behave in different ways in terms of prognosis, which is later confirmed by the Cox analysis and the HR values.

Then I thought to combine my gene of interest "X" with each of the genes in these lists. Although I got really interesting results, I am in doubt about how to interpret the data and present it in a scientific publication. See the following example:

[Image: single-marker Cox results (left panel, yellow) and results in combination with gene "A" (right panel, red)]

The yellow panel (on the left) shows the single-marker Cox analysis, which reports a significant HR for that gene at low expression. But when I run the analysis in combination (right panel) with my gene "A" (red), whether "A" is high or low, some of the combined genes, which on their own had a significant or non-significant HR at low or high expression, show a high and significant HR when combined with "A". For comparison, my gene "A" on its own has an HR of 2.98.

My question is: what criteria should I use to select the right combination to plot as a KM curve in a scientific manuscript? In other words, is it right to present the combination that shows a higher HR (when combined) than my gene of interest "A" shows on its own?

Hope it's clear.

kjetil b halvorsen
Pros

1 Answer


It's seldom a good idea to put too much importance on the results of a single-predictor Cox model or Kaplan-Meier curve. First, that doesn't allow correction for any correlations of that single predictor with other predictors that might be more strongly associated with outcome and might have stronger clinical justifications for inclusion in the model. Second, it can hurt you to omit any predictor that is associated with outcome, even if uncorrelated with your predictor of interest, as Cox models have the same bias toward low-magnitude coefficient estimates in that situation as seen with logistic regression.
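The second point can be checked with a little arithmetic in the logistic-regression setting mentioned above. This is only a sketch of the logistic analogue, not a Cox simulation: the coefficients (1.0 for the predictor of interest, 2.0 for the omitted covariate) and the function names are made up for illustration. The omitted covariate `z` is constructed to be independent of `x`, yet averaging over it still shrinks the log odds ratio for `x`.

```python
import math


def expit(v):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-v))


def logit(p):
    """Log odds of a probability."""
    return math.log(p / (1.0 - p))


def marginal_log_odds_ratio(b_x, b_z):
    """Log odds ratio for binary x once a covariate z is left out of the model.

    Assumed setup (hypothetical): z is -1 or +1 with probability 1/2 each,
    independent of x, and P(event | x, z) = expit(b_x * x + b_z * z).
    """
    def p_event(x):
        # Average the event probability over the two equally likely z values.
        return 0.5 * (expit(b_x * x - b_z) + expit(b_x * x + b_z))

    return logit(p_event(1)) - logit(p_event(0))


conditional = 1.0  # true coefficient of x when z is in the model
marginal = marginal_log_odds_ratio(b_x=conditional, b_z=2.0)
print(f"conditional log-OR: {conditional:.3f}, marginal log-OR: {marginal:.3f}")
```

With these made-up values the marginal log odds ratio comes out at roughly 0.45, less than half of the conditional value of 1, even though `z` is uncorrelated with `x`.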

Also, if you have a very large list of genes to combine in this way with your primary gene of interest, you need to be correcting for multiple comparisons. With more than a handful of genes, this is typically done by controlling for the false-discovery rate.
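As one concrete way to control the false-discovery rate, here is a minimal sketch of the Benjamini-Hochberg adjustment in plain Python. The answer does not say which FDR procedure to use, so Benjamini-Hochberg is an assumption here, and the p-values in the example are invented placeholders, not results from the question.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values, one per gene combined with the gene of interest.
example_p = [0.01, 0.04, 0.03, 0.005]
example_adjusted = benjamini_hochberg(example_p)
print(example_adjusted)
```

A combination would then be reported as significant only if its adjusted value falls below the chosen FDR threshold (for example 0.05).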

Should you find any combinations of genes still to be significant after that correction, you certainly could illustrate the results with Kaplan-Meier curves for some of those combinations (presumably 4 curves per plot in your case). I would prefer to see predictions and standard errors from Cox models that also take into account the standard clinical variables that are known to be associated with outcomes in your disease of interest, to help demonstrate that the associations of the gene expression levels with outcome don't just arise from associations of those expression levels with standard clinical predictors, and to give a sense of how reliable the associations with outcome are.
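To make the "4 curves per plot" idea concrete, here is a rough sketch that computes one Kaplan-Meier curve per combination of high/low levels of two genes. The function names, the record layout and the toy records are all hypothetical, and the sketch leaves out confidence intervals and the clinical covariates discussed above.

```python
def kaplan_meier(times, events):
    """Return [(event_time, survival_probability)] for one group.

    events[i] is 1 for an observed event and 0 for censoring.
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 1)
        leaving = sum(1 for ti in times if ti == t)
        if deaths > 0:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        # Everyone observed at time t (event or censored) leaves the risk set.
        at_risk -= leaving
    return curve


def curves_by_group(records):
    """records: iterable of (gene_a_level, gene_b_level, time, event)."""
    groups = {}
    for a_level, b_level, time, event in records:
        times, events = groups.setdefault((a_level, b_level), ([], []))
        times.append(time)
        events.append(event)
    return {g: kaplan_meier(ts, es) for g, (ts, es) in groups.items()}


# Made-up toy records purely to show the shape of the input.
toy_records = [
    ("high", "high", 5, 1), ("high", "high", 8, 0),
    ("high", "low", 12, 1), ("high", "low", 20, 0),
    ("low", "high", 15, 1), ("low", "high", 30, 0),
    ("low", "low", 25, 1), ("low", "low", 40, 0),
]
toy_curves = curves_by_group(toy_records)
for group, curve in sorted(toy_curves.items()):
    print(group, curve)
```

Each of the four keys in `toy_curves` corresponds to one step curve on the plot; a real analysis would typically rely on an established survival library rather than this hand-rolled estimator.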

EdM