I need guidance. I am not new to reading and appraising articles in clinical journals, but after sifting through the discussions here I feel I am very new to understanding regression (if I understand it at all)! EPV (events per variable) is very important in modeling, yet many published clinical papers perform (mostly) logistic regression analyses using multiple predictor variables with EPV < 5. A table is then produced showing, e.g., 10 variables and whether each was statistically significant after multivariate analysis. Is this right? At the end of the day, odds ratios are exponentiated coefficients, so if the coefficients are biased, wouldn't the conclusions be biased too?

Edit (taking Scortchi's advice to elaborate): An adequate EPV is important, as shown in the article cited in the comments below (Vittinghoff & McCulloch) and in this one (I am sure I am missing many others): van der Ploeg T, Austin PC, Steyerberg EW. Modern modeling techniques are data hungry: a simulation study for predicting dichotomous endpoints. BMC Medical Research Methodology, Vol. 14, No. 1 (22 December 2014), 137. doi:10.1186/1471-2288-14-137.
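
To make the two quantities in the question concrete, here is a minimal sketch in Python. It assumes EPV is computed as the size of the smaller outcome group divided by the number of candidate model parameters, with each predictor costing one parameter. The function names are hypothetical, and the 20% coefficient inflation is an arbitrary illustrative figure, not an estimate of the bias at any particular EPV. The counts (20 events, 200 non-events, 10 predictors) are the scenario raised in the comments.

```python
import math


def events_per_variable(n_events, n_non_events, n_parameters):
    """EPV: smaller outcome group divided by the number of candidate parameters.

    Assumption: each predictor contributes one parameter (no dummy coding,
    splines, or interactions).
    """
    if n_parameters <= 0:
        raise ValueError("n_parameters must be positive")
    return min(n_events, n_non_events) / n_parameters


def odds_ratio(coefficient):
    """An odds ratio is the exponentiated logistic regression coefficient."""
    return math.exp(coefficient)


# Scenario from the comments: 20 events, 200 non-events, 10 predictors.
epv = events_per_variable(20, 200, 10)
print(epv)  # 2.0

# Hypothetical illustration: a coefficient whose true value is log(2)
# is overestimated by 20% (an arbitrary figure chosen for this sketch).
true_beta = math.log(2.0)
biased_beta = 1.2 * true_beta
print(odds_ratio(true_beta), odds_ratio(biased_beta))
```

Because the odds ratio is `exp(coefficient)`, any bias on the coefficient scale carries over multiplicatively to the reported odds ratio, which is the concern raised in the question.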

kjetil b halvorsen
saifulsafuan
  • An explanation of what "EPV" means & why it's "very important in modelling" might help to make your question clearer. – Scortchi - Reinstate Monica Oct 14 '16 at 13:28
  • I should've said to edit your question rather than to reply in comments. – Scortchi - Reinstate Monica Oct 14 '16 at 13:29
  • Thanks (+1). I still feel your thoughts on why it's important might help guide answerers. Anyway, [Vittinghoff & McCulloch (2006), *Am. J. Epidemiol.*, **165**, 6, "Relaxing the Rule of Ten Events per Variable in Logistic and Cox Regression"](https://aje.oxfordjournals.org/content/165/6/710.full.pdf) seems relevant. – Scortchi - Reinstate Monica Oct 14 '16 at 13:51
  • EPV is more of a design issue than an analysis one. After the data collection the confidence intervals carry the required information. – mdewey Oct 14 '16 at 14:32
  • How would design be involved? Say I perform a prospective cohort study and only manage to get 20 events & 200 non-events, but have 10 predictor variables to assess in a multivariate analysis: wouldn't my coefficients be biased, and hence the odds ratios too? How would changing the design to case-control help? There I would already be limiting the information I have on a few variables when I match controls to cases. – saifulsafuan Oct 14 '16 at 14:48
  • Thank you Kjetil b Halvorsen & Peter Flom for your guidance to the alternate question. The answers & comments there helped. – saifulsafuan Sep 07 '17 at 12:43