Should quantitative predictors be transformed to be normally distributed?

Question

I always struggle with normality testing for quantitative predictors (not factors) and with transforming them to normality.

If I am running a GLMM and my predictors are really non-normal, should I transform them as well to try to make them normally distributed?
I know that this is important for the response variable but what should be done with predictors?

P.S.: I really could not find a similar question.

There are *loads* of threads on exactly this subject. Follow the `data-transformation` tag. — whuber, Jul 06 '11 at 15:24
@whuber, I think I once read your response to this question (it is indeed an FAQ), but I have not been able to find the thread :( — Dmitrij Celov, Jul 06 '11 at 15:31

score 17 · Accepted Answer · edited Jul 06 '11 at 15:38

There is nothing in the theory behind regression models that requires X to follow any particular distribution. All you need is a minimum number of observations in each range of X about which you want to learn something. The only problem you usually run into is overly influential observations caused by a heavy right tail in the distribution of X. To deal with that, I often fit something like a restricted cubic spline in the cube root or square root of X. In the R `rms` package this would look like `y ~ rcs(x^(1/3)) + ...` (other variables) or `rcs(sqrt(x), 5) + ...` (the 5 means five knots, using default knot placement). That way you assume only a smooth relationship, but you limit the influence of large values while still allowing for zeros (though not negative values).
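To make the idea concrete, here is a minimal sketch on simulated data. It is not the exact `rms` fit described above. It swaps in `splines::ns()` from base R, which also fits a natural (restricted) cubic spline but uses a different basis and different default knot placement than `rcs()`. The data-generating model, the variable names (`d`, `fit_raw`, `fit_spl`), and the choice `df = 4` are all assumptions made for illustration.

```r
# Illustrative sketch only: simulated data, hypothetical names.
library(splines)  # ships with R; ns() builds a natural cubic spline basis

set.seed(1)
n <- 500
d <- data.frame(x = rlnorm(n, meanlog = 0, sdlog = 1))  # heavy right tail, x > 0
d$y <- log1p(d$x) + rnorm(n, sd = 0.3)                  # smooth, nonlinear truth

# Straight line in raw x: the largest x values can dominate the fit.
fit_raw <- lm(y ~ x, data = d)

# Natural cubic spline in sqrt(x). df = 4 means 3 interior knots (placed at
# quantiles) plus the 2 boundary knots, so 5 knots in total.
# With rms installed, the analogue from the answer would be something like
#   rms::ols(y ~ rcs(sqrt(x), 5), data = d)
fit_spl <- lm(y ~ ns(sqrt(x), df = 4), data = d)

# Compare the largest leverage (hat value) under each specification.
print(c(raw = max(hatvalues(fit_raw)), spline = max(hatvalues(fit_spl))))

# Predictions on the original scale; x = 0 is allowed because sqrt(0) is defined.
print(predict(fit_spl, newdata = data.frame(x = c(0, 1, 4, 16))))
```

Note that no transformation toward normality appears anywhere: the square root is there only to pull in the right tail, and the spline takes care of the shape of the relationship.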

Many thanks! In my case the predictors are not that heavily right-tailed, but it is good to know that `rms` exists. Thanks for sharing your approach. — Jens, Jul 06 '11 at 15:44
