
I have two equations and am trying to determine which is the better fit using AIC and BIC: a quadratic equation of the form

$$y = \beta_{1}x^2+\beta_{2}x+\beta_{0}$$

and a fractional power equation of the form

$$y = \beta_{1}x^{1/2}+\beta_{0}$$

both of which I fitted using the lm() function in R. The AIC and BIC values of the two equations are very close, so which model comes out as the better fit will very likely depend heavily on the value I use for k. I am trying to figure out what the most appropriate k would be for the second equation. I know that k usually equals the number of predictor terms plus 1, so k=3 for the quadratic regression equation and k=2 for a simple linear regression.

However, I cannot find what the number of predictor terms should be for a fractional exponent. I would assume k=2, since there is only one predictor term, but that term is not a straight linear function of x and I am not sure whether I need to adjust k to account for that. I have tried looking up how k is chosen for AIC, but none of the references I have found discuss how to handle power equations at all.
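For reference, here is a minimal sketch of how R itself counts parameters for these two kinds of fit. The data are simulated purely for illustration (they are not the data from this question), and the object names `fit_quad` and `fit_sqrt` are made up. Note that `logLik()` for an `lm` fit reports the number of coefficients plus one for the residual variance, so the counts R uses are one higher than the k values quoted above; the shift is the same for both models.

```r
# Simulated data for illustration only; NOT the data from the question.
set.seed(1)
x <- seq(1, 50, length.out = 40)
y <- 2 + 3 * sqrt(x) + rnorm(length(x), sd = 0.5)

# Both models are fitted by lm(), i.e. both are linear in their coefficients.
fit_quad <- lm(y ~ I(x^2) + x)    # intercept, x, x^2      -> 3 coefficients
fit_sqrt <- lm(y ~ I(x^(1/2)))    # intercept, x^(1/2)     -> 2 coefficients

# Number of estimated regression coefficients in each fit
length(coef(fit_quad))
length(coef(fit_sqrt))

# Parameter count that AIC() and BIC() actually use for lm fits:
# the coefficients plus one for the residual variance.
k_quad <- attr(logLik(fit_quad), "df")
k_sqrt <- attr(logLik(fit_sqrt), "df")

# Side-by-side comparison; the "df" column is the k used in the penalty.
AIC(fit_quad, fit_sqrt)
BIC(fit_quad, fit_sqrt)
```

The `df` column in the output of `AIC()` and `BIC()` shows the parameter count applied to each model, so there is no need to supply k by hand when both models come from `lm()`.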

Richard Hardy
user2352714
  • Your terminology is confusing. First, there is no "exponential equation" shown. That would be of the form $y=\exp(x^2+x+c),$ for instance. Are there perhaps typographical errors? Second, it's unclear whether you are contemplating models of the form $y=\beta_1x^2+\beta_2x+c$ (three parameters) or, literally, $y=x^2+x+c$ (one parameter). As you can see, the value of $k$ depends on how one might interpret your notation, so please edit your post to clarify it. – whuber Feb 10 '21 at 20:15
  • @whuber Yes, you are right, I meant to say quadratic. I'm not sure if $y=\beta_1x^2+\beta_2x+c$ would be the correct way to write the equation. What I have is a basic quadratic equation created by fitting lm(y~I(x^2)+x) in R. I think what you said is correct, but I am not sure about the terminology (math is not usually my field). – user2352714 Feb 10 '21 at 20:26
  • Thank you for the edits. But isn't the answer to your question clear? $k$ is the number of coefficients in the model. There are some subtleties: if, for instance, you chose these two particular forms because they seem to fit the data well, then arguably you have at least one more *implicit* parameter representing the power of $x$ you chose to use ($2$ or $1/2$). – whuber Feb 10 '21 at 22:18
  • @whuber Right, my concern is that using x^(1/2) makes an additional assumption beyond a straight linear equation of $y=\beta_1x+\beta_0$. The two equations above aren't the only ones I'm examining; they are just the two that produce the best fit values in general, and they are close enough that it's ambiguous which might be better. I tried fitting with nls to see if it favored one over the other, but I couldn't get it to consider both a power law and a quadratic equation as possibilities (I got a singular gradient matrix error). – user2352714 Feb 10 '21 at 22:32
  • From the standpoint of statistical modeling, *both* models are "straight linear equations," because the linearity that matters is the linearity in the *parameters.* Multiple regression conceives of $x$, $x^2,$ $x^{1/2},$ and even the constant $1$ as "features" and treats them all on the same footing. See https://stats.stackexchange.com/a/148713/919 for a fuller account of this. – whuber Feb 10 '21 at 22:51

0 Answers