14

In general, I'm wondering whether it is ever better not to use orthogonal polynomials when fitting a regression with higher-order terms. In particular, regarding the use of R:

If poly() with raw = FALSE produces the same fitted values as poly() with raw = TRUE, and raw = FALSE avoids some of the problems associated with polynomial regressions, then should poly() with raw = FALSE always be used for fitting polynomial regressions? In what circumstances would it be better not to use it?
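For concreteness, here is a minimal sketch (simulated data, made-up coefficients) showing the premise of the question: the two bases span the same column space, so the fitted values agree.

```r
set.seed(1)
x <- runif(50)
y <- 1 + 2 * x - 3 * x^2 + rnorm(50, sd = 0.1)

fit_raw  <- lm(y ~ poly(x, 2, raw = TRUE))
fit_orth <- lm(y ~ poly(x, 2))  # raw = FALSE is the default

# Identical fits; only the coefficient parameterization differs
all.equal(fitted(fit_raw), fitted(fit_orth))  # TRUE
```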

Scortchi - Reinstate Monica
user2374133

2 Answers

17

Ever a reason? Sure; likely several.

Consider, for example, a case where I am interested in the values of the raw coefficients themselves (say, to compare them with hypothesized values), and collinearity isn't a particular problem. It's much the same reason why I often don't mean-center in ordinary linear regression (which is the linear orthogonal polynomial).

They're not things you can't deal with via orthogonal polynomials; it's more a matter of convenience, but convenience is a big reason why I do a lot of things.

That said, I lean toward orthogonal polynomials in many cases while fitting polynomials, since they do have some distinct benefits.

Glen_b
  • Is it possible to compare the coefficients resulting from an orthogonal polynomial regression to hypothesized values? – user2374133 Jun 20 '14 at 04:55
  • 2
    Yes. You can transform them back to the implied coefficients and standard errors from the "raw" polynomials, for example. – Glen_b Jun 20 '14 at 05:03
  • 3
    More often than not, converting from the orthogonal polynomial basis to the monomial basis is an ill-conditioned process (for high degrees; low-degree conversion is not too bad), so if one is *a priori* interested in the monomial basis coefficients, any numerical stability you gained from using the orthogonal polynomials is thrown out the window at conversion, so you might as well use monomials at the outset. *Caveat emptor*, of course. – J. M. is not a statistician Jan 27 '17 at 15:07
  • 2
    @J.M. Thanks, that's an excellent point. Fortunately it would be very rare in statistical applications these days to fit more than a fairly low order polynomial (my usual advice is that unless there's a strong theoretical reason to go above degree three or four, one should look at different approaches -- which alternative might be best depends on the circumstances, but things like splines, for example, may be suitable for some situations.). – Glen_b Jan 27 '17 at 23:41
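The back-transformation mentioned in the comments can be sketched as follows (a minimal illustration with simulated data). Since the fitted curve is the same in either basis, regressing the orthogonal fit's fitted values on the raw monomials recovers the implied raw-scale coefficients exactly; transforming the standard errors would additionally require mapping the covariance matrix through the same linear transformation.

```r
set.seed(1)
x <- runif(100)
y <- 1 + 2 * x - 3 * x^2 + rnorm(100, sd = 0.1)

fit_orth <- lm(y ~ poly(x, 2))     # orthogonal basis
fit_raw  <- lm(y ~ x + I(x^2))     # raw (monomial) basis

# Same fitted curve in both bases, so this refit is exact (zero residuals)
implied <- coef(lm(fitted(fit_orth) ~ x + I(x^2)))

all.equal(unname(implied), unname(coef(fit_raw)))  # TRUE
```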
14

Because if your model leaves R when it grows up, you have to remember to pack its centring & normalization constants, & then it has to lug them around the whole time. Imagine coming across it one day hard-coded into SQL, & the horror of realizing it's mislaid them!
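To see what those constants look like in R: `poly()` stores its centring and normalization constants as an attribute of the basis, and `predict()` reuses them when scoring new data. Any system outside R reproducing the model must carry them along. A small sketch:

```r
x <- 1:20
p <- poly(x, 2)

# The orthogonalization constants the model depends on
str(attr(p, "coefs"))

# predict() applies those *same* constants to new x values;
# recomputing poly() on the new data alone would give a different basis
new_basis <- predict(p, newdata = c(2.5, 7.5))
```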

Scortchi - Reinstate Monica