I am always struggling with normality testing for quantitative predictors (no factors) and transforming them to normality.

  • If I am running a GLMM and my predictors are really non-normal, should I transform them as well to try to make them normally distributed?
  • I know that this is important for the response variable, but what should be done with the predictors?

P.S.: I really could not find a similar question.

gung - Reinstate Monica
Jens

1 Answer

There is nothing in the theory behind regression models that requires X to follow any particular distribution; you only need a minimum number of observations in each range of X about which you want to learn something. The only problem you usually run into is overly influential observations caused by a heavy right tail in the distribution of X. To deal with that, I often fit something like a restricted cubic spline in the cube root or square root of X. In the R rms package this would look like y ~ rcs(x^(1/3)) + ... other variables, or rcs(sqrt(x), 5) + ... (the 5 requests 5 knots, using default knot placement). That way you assume only a smooth relationship but limit the influence of large values, while still allowing zeros (though not negative values).
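
As a minimal sketch of the spline-in-the-square-root idea, assuming the rms package is installed: the simulated data, the seed, the variable names x, y, d and fit, and the choice of ols() as the fitting function are all illustrative assumptions, not part of the original question.

```r
library(rms)  # provides rcs() and ols()

set.seed(1)
n <- 300

# Hypothetical right-skewed, nonnegative predictor, including one exact zero
x <- rexp(n, rate = 0.2)^1.5
x[1] <- 0

# Hypothetical smooth relationship plus noise
y <- log1p(x) + rnorm(n, sd = 0.3)
d <- data.frame(x = x, y = y)

# Restricted cubic spline with 5 knots (default placement) in sqrt(x).
# The square root pulls in the right tail before the spline is built.
fit <- ols(y ~ rcs(sqrt(x), 5), data = d)

print(fit)
anova(fit)  # includes a test of the nonlinear spline terms
```

With 5 knots the spline contributes 4 regression coefficients (one linear term and three nonlinear terms), so the fit has 5 coefficients including the intercept. Replacing sqrt(x) with x^(1/3) gives the cube-root variant.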

Gavin Simpson
Frank Harrell
  • Many thanks! In my case the predictors are not that awfully right-tailed, but it is good to know that rms exists. Thanks for sharing your approach. – Jens Jul 06 '11 at 15:44