Testing nonlinearity in logistic regression (or other forms of regression)

Question

One of the assumptions of logistic regression is linearity in the logit. Once I had my model up and running, I tested for nonlinearity using the Box-Tidwell test. One of my continuous predictors (X) tested positive for nonlinearity. What am I supposed to do next?

Since this violates the assumptions, should I drop the predictor (X), include a nonlinear transformation (X*X), or convert the variable into a categorical one? If you have a reference, could you please point me to it as well?

chl · Answer 1 · 2010-10-29T08:09:39.157

I would suggest using restricted cubic splines (rcs in R; see the Hmisc and Design packages for examples of use) instead of adding powers of $X$ to your model. This approach is recommended by Frank Harrell, for instance, and you will find a nice illustration in his handouts (§2.5 and chap. 9) on Regression Modeling Strategies (see the companion website).
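To make the idea concrete, here is a minimal sketch of a restricted cubic spline basis in plain Python. It uses the truncated-power form described in Regression Modeling Strategies, scaled by the squared distance between the outer knots. The function name `rcs_basis` and the knot values in the usage note are illustrative assumptions, not the API of the R packages mentioned above.

```python
def _cube_plus(u):
    """Truncated cubic: u**3 for u > 0, otherwise 0."""
    return max(u, 0.0) ** 3


def rcs_basis(x, knots):
    """Return [x, s_1(x), ..., s_{k-2}(x)] for a restricted cubic spline.

    With k knots there are k - 2 nonlinear terms. Each term is built so that
    the spline is linear below the first knot and above the last knot.
    """
    t = sorted(knots)
    if len(t) < 3:
        raise ValueError("a restricted cubic spline needs at least 3 knots")
    d = t[-1] - t[-2]                # gap between the last two knots
    scale = (t[-1] - t[0]) ** 2      # keeps the terms on the scale of x
    terms = [x]
    for tj in t[:-2]:
        s = (_cube_plus(x - tj)
             - _cube_plus(x - t[-2]) * (t[-1] - tj) / d
             + _cube_plus(x - t[-1]) * (t[-2] - tj) / d)
        terms.append(s / scale)
    return terms
```

For example, `rcs_basis(2.5, [1, 2, 3, 4, 5])` returns four columns (the linear term plus three spline terms) that could be entered as predictors in place of the single X; a joint test of the three spline coefficients is then a test of nonlinearity.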

You can compare the results with your Box-Tidwell test by using the boxTidwell() function in the car package.
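For readers unsure what such a check involves, one common form of the Box-Tidwell check in logistic regression adds an x·ln(x) term alongside x and looks at the Wald z statistic of the added term. The sketch below assumes that form (the question does not say exactly how the test was run) and uses a hand-rolled Newton-Raphson fit so that it is self-contained. All function names are hypothetical, and this is not the implementation in the car package.

```python
import math


def solve(A, b):
    """Solve the small dense system A x = b by Gauss-Jordan elimination."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def score_and_info(rows, y, beta):
    """Gradient and Fisher information of the logistic log-likelihood."""
    k = len(beta)
    grad = [0.0] * k
    info = [[0.0] * k for _ in range(k)]
    for r, yi in zip(rows, y):
        eta = sum(b * v for b, v in zip(beta, r))
        eta = max(-30.0, min(30.0, eta))  # guard against overflow in exp
        p = 1.0 / (1.0 + math.exp(-eta))
        w = p * (1.0 - p)
        for i in range(k):
            grad[i] += (yi - p) * r[i]
            for j in range(k):
                info[i][j] += w * r[i] * r[j]
    return grad, info


def fit_logit(rows, y, n_iter=25):
    """Newton-Raphson logistic fit; returns coefficients and standard errors."""
    k = len(rows[0])
    beta = [0.0] * k
    for _ in range(n_iter):
        grad, info = score_and_info(rows, y, beta)
        beta = [b + s for b, s in zip(beta, solve(info, grad))]
    _, info = score_and_info(rows, y, beta)
    # Standard errors come from the diagonal of the inverse information matrix.
    se = []
    for j in range(k):
        unit = [1.0 if i == j else 0.0 for i in range(k)]
        se.append(math.sqrt(solve(info, unit)[j]))
    return beta, se


def box_tidwell_z(x, y):
    """Wald z statistic for the x*ln(x) term in logit(p) = b0 + b1*x + b2*x*ln(x)."""
    if min(x) <= 0:
        raise ValueError("x must be strictly positive for x*ln(x)")
    rows = [[1.0, xi, xi * math.log(xi)] for xi in x]
    beta, se = fit_logit(rows, y)
    return beta[2] / se[2]
```

A large absolute z (roughly above 2) for the added term is what "testing positive for nonlinearity" amounts to under this assumed form; it says the logit is not linear in x, but not which functional form to use instead.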

Transforming continuous predictors into categorical ones is generally not a good idea; see e.g. Problems Caused by Categorizing Continuous Variables.

score 6 · Answer 2 · answered Oct 29 '10 at 08:37

It may be appropriate to include a nonlinear transformation of x, but probably not simply x × x, i.e. x². I believe you may find this a useful reference for determining which transformation to use:

G. E. P. Box and Paul W. Tidwell (1962). Transformation of the Independent Variables. Technometrics Volume 4 Number 4, pages 531-550. http://www.jstor.org/stable/1266288

Some consider the Box-Tidwell family of transformations to be more general than is often appropriate for interpretability and parsimony. Patrick Royston and Doug Altman introduced the term fractional polynomials for Box-Tidwell transformations with simple rational powers in an influential 1994 paper:

P. Royston and D. G. Altman (1994). Regression using fractional polynomials of continuous covariates: parsimonious parametric modeling. Applied Statistics Volume 43: pages 429–467. http://www.jstor.org/stable/2986270

Patrick Royston in particular has continued to work and publish both papers and software on this, culminating in a book with Willi Sauerbrei:

P. Royston and W. Sauerbrei (2008). Multivariable Model-building: A Pragmatic Approach to Regression Analysis Based on Fractional Polynomials for Modelling Continuous Variables. Chichester, UK: Wiley. ISBN 978-0-470-02842-1
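As a rough illustration of what a fractional polynomial transformation looks like, the sketch below uses the conventional power set from Royston and Altman, in which power 0 stands for ln(x) and a repeated power in a second-degree model contributes an extra factor of ln(x). The names `FP_POWERS`, `fp_term` and `fp2_terms` are hypothetical and do not correspond to any particular software package.

```python
import math

# Conventional candidate powers; 0 is read as the natural logarithm.
FP_POWERS = (-2, -1, -0.5, 0, 0.5, 1, 2, 3)


def fp_term(x, p):
    """First-degree fractional polynomial term: x**p, or ln(x) when p == 0."""
    if x <= 0:
        raise ValueError("fractional polynomials require x > 0")
    return math.log(x) if p == 0 else x ** p


def fp2_terms(x, p1, p2):
    """Second-degree terms: (x^p1, x^p2), or (x^p, x^p * ln(x)) when p1 == p2."""
    first = fp_term(x, p1)
    if p1 == p2:
        second = first * math.log(x)
    else:
        second = fp_term(x, p2)
    return first, second
```

In practice one would fit the model once for each candidate power (or pair of powers) and keep the best-fitting transformation by deviance, which is what keeps the family small and interpretable compared with estimating an arbitrary real-valued power.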

score 5 · Answer 3 · answered Oct 29 '10 at 09:02

Don't forget to check for interactions between X and other independent variables. Leaving interactions unmodeled can make X look like it has a non-linear effect when it simply has a non-additive one.
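A heuristic way to see this, under assumptions that are not in the original answer: suppose the true model contains an interaction with a second predictor $Z$, and $Z$ is linearly related to $X$ on average (the symbols $\beta$, $\gamma$ and $\eta$ are introduced here only for illustration).

$$\eta = \operatorname{logit} P(Y=1 \mid X, Z) = \beta_0 + \beta_1 X + \beta_2 Z + \beta_3 XZ, \qquad E[Z \mid X] = \gamma_0 + \gamma_1 X$$

Averaging the linear predictor over $Z$ for a given $X$ then gives

$$E[\eta \mid X] = (\beta_0 + \beta_2\gamma_0) + (\beta_1 + \beta_2\gamma_1 + \beta_3\gamma_0)\,X + \beta_3\gamma_1\,X^2$$

so whenever $\beta_3 \neq 0$ and $\gamma_1 \neq 0$, a model that omits $XZ$ sees an apparent $X^2$ term. Because the logit link is nonlinear, the marginal model is not exactly this expression, but the quadratic term shows where the spurious curvature comes from.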

conjugateprior

Good point. I've only come across the converse: assuming an effect is linear when it isn't can lead to spurious statistical evidence for multiplicative interaction terms. – onestop Oct 29 '10 at 12:23

@onestop, do you have a reference about that? I believe it, but I'm having trouble figuring out exactly why that would happen. – Macro May 15 '12 at 12:23
