
I was trying to help a student with stats homework, but an example given in class has me a bit confused.

Background, to my understanding: A linear model is defined as one where $E[Y] = \beta_0 + \beta_1 X$. So, for example, the model $E[Y] = \beta_0 + \beta_1 X^2$ can be framed as a linear model with the change of variable $X' = X^2$. Or $E[Y] = \exp(\beta_0 + \beta_1 X)$ with the change of variable $Y' = \ln Y$.
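To make the change-of-variable idea concrete, here is a minimal sketch (the data and coefficients are made up for illustration): after transforming $Y' = \ln Y$, ordinary least squares on $(1, X)$ recovers the parameters of the exponential model.

```python
import numpy as np

# Hypothetical data from E[Y] = exp(0.5 + 1.5 X) with multiplicative noise
# (all values here are illustrative, not from the original post).
rng = np.random.default_rng(1)
x = np.linspace(0, 1, 40)
y = np.exp(0.5 + 1.5 * x) * np.exp(rng.normal(0, 0.05, x.size))

# Change of variable Y' = ln Y: the model is now linear in beta_0, beta_1.
A = np.column_stack([np.ones_like(x), x])
beta, *_ = np.linalg.lstsq(A, np.log(y), rcond=None)
print(beta)  # close to [0.5, 1.5]
```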

So then we get to the model $E[Y] = \beta_0 + \beta_1 X + \beta_2 X^2 + \beta_3 X^3$. The professor says that, no problem, we can just substitute $R = X^2$ and $S = X^3$ to get $E[Y] = \beta_0 + \beta_1 X + \beta_2 R + \beta_3 S$, and we have a multiple linear regression with three variables.
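The professor's substitution can be sketched directly (the data and coefficient values below are invented for illustration): build a design matrix whose columns are $X$, $X^2$, and $X^3$, and hand it to an ordinary least-squares solver exactly as if the powers were three separate predictors.

```python
import numpy as np

# Hypothetical data from a cubic relationship; the true coefficients
# [1.0, 2.0, -0.5, 0.25] are illustrative, not from the original post.
rng = np.random.default_rng(0)
x = np.linspace(-2, 2, 50)
y = 1.0 + 2.0 * x - 0.5 * x**2 + 0.25 * x**3 + rng.normal(0, 0.1, x.size)

# The substitution R = x**2, S = x**3: the powers become ordinary columns,
# so the model is linear in the parameters beta_0..beta_3.
X = np.column_stack([np.ones_like(x), x, x**2, x**3])

# Ordinary least squares on the resulting "three-variable" regression.
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta)  # close to [1.0, 2.0, -0.5, 0.25]
```

The solver never knows that two of the columns are powers of the first; it only sees a matrix of numbers, which is why the estimation problem stays linear.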

It seems ridiculous to me to just hide non-linear terms this way and call them new variables. So ridiculous that I'm having a hard time even finding the right foothold to criticize it. Could someone either explain why I'm wrong or help articulate my objection a bit more clearly?

Ashton Baker
  • Does this answer your question? [Why is polynomial regression considered a special case of multiple linear regression?](https://stats.stackexchange.com/questions/92065/why-is-polynomial-regression-considered-a-special-case-of-multiple-linear-regres) – Artem Mavrin Feb 11 '20 at 03:57
  • As you're estimating parameters, linearity in the parameters is the big deal. The variables are just numbers fed to a program and don't know where they came from, e.g. that you calculated powers of an original variable in this case. – Nick Cox Feb 11 '20 at 10:48
