
I performed a simple linear regression on my data, but now I want to increase the model complexity using R.

I was told to use:

lm(y ~ poly(x, degree = n))

to increase the degree of the polynomial to $n$, but this doesn't seem right to me.

For example, I get different estimates and standard errors when I compare:

lm(y ~ x)

and

lm(y ~ poly(x, degree = 1))

Shouldn't they be the same? Sorry if this isn't much information. Still learning R and statistics.

mdlee6
  • Please explain why you would want to make a model more complex. Ordinarily that is not desirable: it is forced on us when we confront a model with data and find it does not fit well. The problem then is to *improve* the model--but not for the sake of making it more complex! – whuber Sep 23 '16 at 22:44
  • @whuber Agreed! Conceptually, the purpose of a model is to reduce the complexity of the real world into something comprehensible. Not the other way around! A "more complex" model could be built by simply adding more "x or y variables". However... by "complex" do you mean, say, "larger", "more realistic", or "less understandable" (for its own sake), etc.? – Graeme Walsh Sep 23 '16 at 22:57
  • http://stackoverflow.com/questions/3822535/fitting-polynomial-model-to-data-in-r That looks like it will answer the question that was asked. Whether the OP *should* do this is another question entirely. – thecity2 Sep 23 '16 at 23:11
  • Actually it was for a given problem that I'm trying to solve. I was asked to increase the complexity to a polynomial of degree 16, then compare the RMSEs across the degrees of the polynomials. The link @thecity2 posted pretty much helped me out! Thanks guys! – mdlee6 Sep 23 '16 at 23:42
  • I have heard R can be good for machine learning. But can I use it to train a [Rube Goldberg machine](https://en.wikipedia.org/wiki/Rube_Goldberg_machine)? – GeoMatt22 Sep 24 '16 at 04:33
  • @batmac The comment above should be moved to the main body of the question. To increase the _"model complexity"_ and _"to increase the complexity to a polynomial of degree 16"_ are two different things altogether. – Graeme Walsh Sep 24 '16 at 10:33

1 Answer

  • There are many ways to increase model complexity, and a polynomial expansion is only one of them. Other options include using a different basis expansion, such as a Fourier basis, or using splines (a short sketch at the end of this answer illustrates this). Examples can be found in the post "What's wrong to fit periodic data with polynomials?".

  • If you want to use a 16th-order polynomial, consider using orthogonal polynomials instead of raw polynomials; the sketch at the end of this answer shows that the two give identical fits, but the orthogonal version is numerically better behaved. Reasons can be found in the post "Why are there large coefficients for higher-order polynomials".

  • If you want to know why lm(y ~ x) and lm(y ~ poly(x, 1)) give different estimates, you can use the model.matrix function to see how the basis expansion changes your data. Here is an example on the mtcars data. Note that even with degree 1, poly() transforms your data.

    > head(model.matrix(mpg~wt,mtcars))
                      (Intercept)    wt
    Mazda RX4                   1 2.620
    Mazda RX4 Wag               1 2.875
    Datsun 710                  1 2.320
    Hornet 4 Drive              1 3.215
    Hornet Sportabout           1 3.440
    Valiant                     1 3.460
    > head(model.matrix(mpg~poly(wt,1),mtcars))
                      (Intercept)   poly(wt, 1)
    Mazda RX4                   1 -0.1096309987
    Mazda RX4 Wag               1 -0.0628232889
    Datsun 710                  1 -0.1646988925
    Hornet 4 Drive              1 -0.0004130092
    Hornet Sportabout           1  0.0408879112
    Valiant                     1  0.0445591041
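
A quick way to see what poly() did here: the degree-1 column is just wt centred and rescaled to unit length, and at higher degrees the columns are additionally made orthogonal to one another. Here is a minimal check of that on the same mtcars data:

    wt <- mtcars$wt
    # poly(wt, 1) is wt centred and divided by the length of the centred vector
    z <- (wt - mean(wt)) / sqrt(sum((wt - mean(wt))^2))
    head(cbind(manual = z, from_poly = poly(wt, 1)[, 1]))
    # at higher degrees the columns are also mutually orthonormal
    round(crossprod(poly(wt, 3)), 10)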
    

However, the fitted values do not change when the degree is 1, because the transformed column spans the same space as the original one:

    > head(as.vector(lm(mpg~wt,mtcars)$fitted.values))
    [1] 23.28261 21.91977 24.88595 20.10265 18.90014 18.79325

    > head(as.vector(lm(mpg~poly(wt,1),mtcars)$fitted.values))
    [1] 23.28261 21.91977 24.88595 20.10265 18.90014 18.79325
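
Finally, the comments mention that the actual task is to fit polynomials of increasing degree and compare their RMSEs. Here is a minimal sketch of how the pieces above fit together, again using mtcars as a stand-in for the real data; the natural-spline basis, the df = 3 choice, and the range of degrees are illustrative assumptions, not part of the original question:

    library(splines)  # ns() gives a natural-spline basis, one alternative expansion

    # Alternative basis expansion (first bullet): a natural spline with 3 df
    fit_spline <- lm(mpg ~ ns(wt, df = 3), data = mtcars)

    # Orthogonal vs. raw polynomials (second bullet): different coefficients,
    # identical fitted values, but the orthogonal design is better conditioned
    fit_orth <- lm(mpg ~ poly(wt, 3),             data = mtcars)
    fit_raw  <- lm(mpg ~ poly(wt, 3, raw = TRUE), data = mtcars)
    all.equal(fitted(fit_orth), fitted(fit_raw))   # TRUE

    # In-sample RMSE as the polynomial degree grows
    degrees <- 1:6   # illustrative; degree 16 as in the comments needs more data to be sensible
    rmse <- sapply(degrees, function(d) {
      fit <- lm(mpg ~ poly(wt, degree = d), data = mtcars)
      sqrt(mean(residuals(fit)^2))
    })
    data.frame(degree = degrees, rmse = rmse)

Keep in mind that in-sample RMSE can only decrease as the degree grows, so a held-out set or cross-validation is needed to see where the extra complexity stops helping, which echoes whuber's comment about not adding complexity for its own sake.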
Haitao Du