Several sources report that discretizing (categorizing) continuous variables prior to statistical analysis has many negative consequences (see, for example, references [1]-[4] below).
Conversely, [5] suggests that some machine learning techniques produce better results when continuous variables are discretized (also noting that supervised discretization methods perform better).
Are there any widely accepted benefits of, or justifications for, this practice from a statistical perspective?
In particular, would there be any justification for discretizing continuous variables within a GLM analysis?
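To make the concern raised in [1] and [3] concrete, here is a minimal simulation sketch of what a median split does to a simple linear fit. Everything in it is an assumption chosen for illustration: the data are synthetic, the true relationship is linear with Gaussian noise (i.e. a Gaussian GLM with identity link), the slope of 0.5 and the sample size are arbitrary, and `r_squared` is a hypothetical helper rather than anything taken from the references.

```python
import numpy as np


def r_squared(x, y):
    """R^2 of an ordinary least-squares fit of y on x with an intercept."""
    X = np.column_stack([np.ones_like(x, dtype=float), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1.0 - resid.var() / y.var()


# Synthetic data (assumption): y depends linearly on a continuous x.
rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=n)
y = 0.5 * x + rng.normal(size=n)

# Dichotomize x at its sample median: 1 if above the median, else 0.
x_split = (x > np.median(x)).astype(float)

r2_continuous = r_squared(x, y)
r2_dichotomized = r_squared(x_split, y)

print(f"R^2 with continuous x:   {r2_continuous:.3f}")
print(f"R^2 with median-split x: {r2_dichotomized:.3f}")
```

In this setup the dichotomized predictor should explain noticeably less variance than the continuous one, which is the information loss the cited papers describe. It says nothing about cases where the true relationship is a step function or strongly non-linear, which is part of what I am asking about.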
[1] Royston P, Altman DG, Sauerbrei W. Dichotomizing continuous predictors in multiple regression: a bad idea. Stat Med 2006; 25:127-41.
[2] Brunner J, Austin PC. Inflation of Type I error rate in multiple regression when independent variables are measured with error. The Canadian Journal of Statistics 2009; 37(1):33-46.
[3] Irwin JR, McClelland GH. Negative consequences of dichotomizing continuous predictor variables. Journal of Marketing Research 2003; 40:366-371.
[4] Harrell Jr FE. Problems caused by categorizing continuous variables. http://biostat.mc.vanderbilt.edu/twiki/bin/view/Main/CatContinuous, 2004. Accessed on 6.9.2004.
[5] Kotsiantis S, Kanellopoulos D. Discretization Techniques: A recent survey. GESTS International Transactions on Computer Science and Engineering; 32(1):47-58.