Categorizing a continuous predictor?

Question

If you had to split a continuous independent variable that ranges from 0 to 9 and counts something (e.g. the number of cigarettes smoked), which of these would you rather use:

1. A median split (which would also eliminate any difference between 0 and 1 cigarettes smoked)
2. Groups based on the frequency distribution
3. Groups consistent with theory (e.g. 0, 1-3, >3), which would leave a larger n in the high-risk group (>3) than in the low-risk one (0): 15%, 35% and 50% of the sample for 0, 1-3 and >3 respectively

What would be the best option? Or, what would you do instead?
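
To make the three options concrete, here is a minimal Python sketch of each grouping. The `cigarettes` list is made-up data, chosen only so that the theory-based groups come out at 15%, 35% and 50%; the function names and the choice of tertiles for the frequency-based grouping are assumptions for illustration, not part of my actual analysis.

```python
from collections import Counter
from statistics import median, quantiles

# Hypothetical counts (0-9) for 20 people; not real data.
cigarettes = [0, 0, 0, 1, 1, 2, 2, 3, 3, 3, 4, 4, 5, 5, 6, 6, 7, 8, 9, 9]


def median_split(values):
    """Option 1: two groups, at or below the median vs. above it."""
    cut = median(values)
    return ["low" if v <= cut else "high" for v in values]


def frequency_groups(values):
    """Option 2: three groups of roughly equal size (tertiles, an assumed choice)."""
    lower, upper = quantiles(values, n=3)
    return ["low" if v <= lower else "middle" if v <= upper else "high"
            for v in values]


def theory_groups(values):
    """Option 3: cut points taken from theory (0, 1-3, >3)."""
    return ["0" if v == 0 else "1-3" if v <= 3 else ">3" for v in values]


for name, grouping in [("median split", median_split),
                       ("frequency-based", frequency_groups),
                       ("theory-based", theory_groups)]:
    print(name, dict(Counter(grouping(cigarettes))))
```

With this made-up sample, the median split puts non-smokers and people who smoke one cigarette in the same group, which is exactly the concern with option 1, while the theory-based cut points give the unbalanced 15% / 35% / 50% split described in option 3.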

Welcome to Cross Validated! I wouldn't categorize it without a good theoretical reason to think that the risk is constant within each newly created category: see [What is the benefit of breaking up a continuous predictor variable?](http://stats.stackexchange.com/q/68834/17230) — Scortchi - Reinstate Monica, Jun 01 '15 at 14:34
Thanks for the reply! But what if you had to work with repeated measurements, i.e. if you had to test differences between groups of X across levels of a Y variable? — Joseph, Jun 01 '15 at 14:40
Could you be more specific? I'm running a repeated measures ANOVA with 2 between-subjects factors and 1 within-subjects factor. — Joseph, Jun 01 '15 at 14:45
I think you need to explain exactly what you're doing in the question, rather than drip-feeding information in a comment thread. — Scortchi - Reinstate Monica, Jun 01 '15 at 14:47
Sorry, I just meant to ask for some clarification of your reply. The point, however, is that I have to run a repeated measures ANOVA and I need to categorize one of my predictors to do so. So my question is still the same: what would you do if you had to? — Joseph, Jun 01 '15 at 14:54
You don't *have* to run a repeated measures ANOVA - instead use a mixed-effects regression model, which can allow for continuous predictors. (Please ignore what I said about interactions, I was mis-interpreting your comment.) — Scortchi - Reinstate Monica, Jun 01 '15 at 15:01
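
As a rough illustration of the mixed-effects suggestion in the last comment, the sketch below fits a random-intercept model with the count kept as a continuous predictor. It assumes Python with `numpy`, `pandas` and `statsmodels`; the data are simulated, and the column names (`subject`, `time`, `cigs`, `y`) and the effect sizes are hypothetical, not taken from the question.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulated repeated-measures data: 100 subjects, 3 occasions each.
rng = np.random.default_rng(0)
n_subjects, n_times = 100, 3
cigs = rng.integers(0, 10, size=n_subjects)           # continuous predictor, 0-9
subject_effect = rng.normal(0, 1, size=n_subjects)    # random intercept per subject

rows = []
for s in range(n_subjects):
    for t in range(n_times):
        # Assumed true model: slope of 0.5 per cigarette plus a time trend.
        y = 2.0 + 0.5 * cigs[s] + 0.3 * t + subject_effect[s] + rng.normal(0, 1)
        rows.append({"subject": s, "time": t, "cigs": int(cigs[s]), "y": y})
data = pd.DataFrame(rows)

# Random intercept for each subject; time is the within-subjects factor,
# and cigs enters as a numeric predictor with no categorization.
model = smf.mixedlm("y ~ cigs + C(time)", data, groups=data["subject"])
result = model.fit()
print(result.summary())
```

Here the slope for `cigs` is estimated directly from the 0-9 scale, so no cut points have to be chosen; further between-subjects factors could be added to the formula in the same way.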

