I hypothesize that a gene signature will identify subjects at lower risk of recurrence: the signature-positive group, about 20% of the population, is expected to have half the event rate of the remainder (a hazard ratio of 0.5). I intend to use samples from a retrospective cohort study. Does the sample size need to be adjusted for the unequal numbers in the two hypothesised groups?
For example, following Collett, D. (2003), Modelling Survival Data in Medical Research, 2nd edition, the required total number of events, $d$, can be found using
\begin{equation} d = \frac{(Z_{\alpha/2} + Z_{\beta})^2}{p_1 p_2 \theta_R^2} \end{equation}
where $Z_{\alpha/2}$ and $Z_{\beta}$ are the upper $\alpha/2$ and upper $\beta$ points, respectively, of the standard normal distribution, and $p_1$ and $p_2$ are the proportions of subjects in the two groups.
For the particular values,
- $p_1 = 0.20$
- $p_2 = 1 - p_1$
- $\theta_R = \log \psi_R = \log 0.50 = -0.693$, where $\psi_R$ is the hazard ratio
- $\alpha = 0.05$ and so $Z_{0.025} = 1.96$
- $\beta = 0.10$ and so $Z_{0.10} = 1.28$

the number of events required (rounded up) to have a 90% chance of declaring a hazard ratio of 0.50 significant at the two-sided 5% level is then given by
\begin{equation} d = \frac{(1.96 + 1.28)^2}{0.20 \times 0.80 \times (\log 0.5)^2} \approx 136.6, \quad \text{so } d = 137 \end{equation}
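To make the arithmetic easy to check, here is a minimal Python sketch of the formula as I have written it above. The function name `required_events` and its parameter names are my own and not from Collett. It uses unrounded normal quantiles from the standard library, so the intermediate value differs slightly from the hand calculation with 1.96 and 1.28.

```python
import math
from statistics import NormalDist


def required_events(hazard_ratio, p1, alpha=0.05, power=0.90):
    """Total number of events d for a two-sided test at level alpha.

    p1 is the proportion of subjects in the first group (p2 = 1 - p1).
    Hypothetical helper, written only to reproduce the formula above.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # upper alpha/2 point
    z_beta = NormalDist().inv_cdf(power)           # upper beta point, beta = 1 - power
    theta = math.log(hazard_ratio)                 # log hazard ratio
    return (z_alpha + z_beta) ** 2 / (p1 * (1 - p1) * theta ** 2)


# 20% / 80% split, as in the question
print(math.ceil(required_events(0.5, 0.20)))
# 50% / 50% split, for comparison
print(math.ceil(required_events(0.5, 0.50)))
```

With these inputs the first call gives 137 events and the second gives 88, so in this formula the group imbalance enters only through the product $p_1 p_2$.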