
I would like to compare the effect of parameters x and z on the dependent variable y. I'm not sure how to tell whether z or x is 'better'/'stronger'/'more likely to be a driver' of y.

For x, when I plotted the data I noticed a quadratic relationship (y ~ x + x^2).

I wrote the polynomial regression call against my data frame dat_CV like this:

lm(dat_CV[[y]] ~ dat_CV[[x]] + I(dat_CV[[x]]^2), data = dat_CV)
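(Here x and y are character strings holding the column names, which is why I index with dat_CV[[x]]; the same model can equivalently be written by building the formula first, roughly like this:)

# Sketch of an equivalent call; assumes x and y are character strings with
# the column names (e.g. x <- "rainfall"). reformulate() builds y ~ x + I(x^2).
form_x <- reformulate(c(x, sprintf("I(%s^2)", x)), response = y)
fit_x  <- lm(form_x, data = dat_CV)
summary(fit_x)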

My output for the model using x is:

Residuals:
    Min      1Q  Median      3Q     Max 
-0.1671 -0.0685  0.0227  0.0665  0.1144 

Coefficients:
                  Estimate Std. Error t value Pr(>|t|)   
(Intercept)       1.143040   0.230929    4.95   0.0017 **
dat_CV[[x]]       0.093053   0.022701    4.10   0.0046 **
I(dat_CV[[x]]^2) -0.001987   0.000477   -4.16   0.0042 **
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.101 on 7 degrees of freedom
Multiple R-squared:  0.713, Adjusted R-squared:  0.63 
F-statistic: 8.68 on 2 and 7 DF,  p-value: 0.0127

The relationship for y ~ z was linear:

lm(dat_CV[[y]] ~ dat_CV[[z]], data = dat_CV)

Residuals:
    Min      1Q  Median      3Q     Max 
-0.0946 -0.0638 -0.0369  0.0943  0.1073 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)   
(Intercept)  -0.0307     0.4418   -0.07   0.9463   
dat_CV[[z]]   3.1370     0.6682    4.69   0.0016 **
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.0911 on 8 degrees of freedom
Multiple R-squared:  0.734, Adjusted R-squared:   0.7 
F-statistic:   22 on 1 and 8 DF,  p-value: 0.00155

Regarding the results of the quadratic model for x:

i) I'm not sure how to interpret the p-values. Can I phrase the result in a paper like this:

Parameter x was found to have a significant quadratic relationship with y (F(2,7) = 8.68, p = 0.013)

Should I be reporting the p-value of I(dat_CV[[x]]^2), or both coefficient rows, instead of or in addition to the overall model p-value?

ii) How do I interpret the fact that the p-values are significant at p < 0.01 for each coefficient but not for the overall model (p = 0.013)?

Comparing the two models

iii) Can I use $R^2$ to compare the linear and quadratic models? If not, can I compare the residual standard errors to say which model has the better goodness of fit?

i.e. y ~ x + x^2: residual s.e. = 0.101

y ~ z: residual s.e. = 0.091

Therefore y ~ z is a 'slightly' better fit? (I know the residual standard errors are almost the same here, but in other comparisons the difference between models was much bigger, so I want to understand the meaning.)

Does this mean that z is a 'better' predictor of y, even though both had significant p-values?

iv) Since the estimates in a quadratic model are no longer a single slope as in a simple linear regression, how can I evaluate the 'size'/'strength' of the relationship in order to compare the models?

Ferdi
  • All the models you mention are linear in the parameters (and in the vectors of predictors). [*Nonlinear regression*](https://en.wikipedia.org/wiki/Nonlinear_regression) is usually reserved for the case where the model is not linear in the parameters, rather than merely curved in some original $x$. So - for example - polynomial regression is generally referred to as multiple linear regression rather than non-linear regression. – Glen_b Nov 30 '17 at 01:14
  • If you want to compare the effects, then you should use models that include *all* variables. Using regression methods alone, you have no basis for claiming any of them are "drivers" for $y$: all you can do is study how they are associated with $y$. – whuber Nov 30 '17 at 23:41

1 Answer


As was pointed out in the comments, you need to include all of your variables in one model to understand their importance. A simple and effective way to assess a variable's importance with respect to your model's ability to make good predictions is the Mean Decrease in Accuracy (permutation importance), which can be computed for any score, such as MSE. Make sure you apply this technique to data that were not used to build the model (hold-out data).
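Here is a minimal sketch of what that could look like for your data. It assumes the columns are literally named x, y and z and that there are enough rows for a hold-out split; with only ~10 observations, leave-one-out cross-validation would be a more realistic way to get hold-out predictions, but the mechanics are the same:

# Minimal permutation-importance sketch (assumes columns named x, y, z)
set.seed(1)
n         <- nrow(dat_CV)
train_idx <- sample(n, size = floor(0.7 * n))
train     <- dat_CV[train_idx, ]
test      <- dat_CV[-train_idx, ]

# One model containing *all* candidate predictors, as the comments recommend
fit <- lm(y ~ x + I(x^2) + z, data = train)

# Baseline prediction error on the hold-out data
baseline_mse <- mean((test$y - predict(fit, newdata = test))^2)

# Permute each predictor in turn and measure how much the hold-out MSE rises
perm_importance <- sapply(c("x", "z"), function(v) {
  mses <- replicate(200, {
    shuffled      <- test
    shuffled[[v]] <- sample(shuffled[[v]])   # break the link between v and y
    mean((test$y - predict(fit, newdata = shuffled))^2)
  })
  mean(mses) - baseline_mse                  # increase in MSE = importance
})
perm_importance

The predictor whose permutation inflates the hold-out MSE the most is the one the model relies on most heavily for prediction, which is a more defensible 'strength' comparison than contrasting the $R^2$ of two separate single-predictor models.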

Chris