My apologies if this is poorly framed or mis-worded, but I've been mildly bugged by a question to which I haven't found a satisfactory answer. I can't say I've seen this discussed in my discipline, which is why I feel like I've been searching in the dark here.
In sum, I was recently statistically evaluating different model fits using information criteria. That is, I had a dataset and I would fit models with different additive terms. I typically see model selection on linear models where the model terms are linear functions of predictor variables $X_i$, such as the $\beta _iX_i$ terms in the example $\hat{Y} = \beta _0 + \beta _1X_{1} + \beta _2X_{2} + \beta _3X_{3}$.
Using information criteria, one can evaluate the elegance of a model: how much information it explains relative to its complexity. From what I've seen, complexity increases with each free parameter in the model (e.g., each $\beta _i$ in the example above).
As I began to include nonlinear terms, I started to wonder whether counting additive terms with free parameters really captures complexity: what about the complexity of the nonlinear terms themselves? From what I understand, a linear function is measured as being just as complex as a highly nonlinear function, so long as the two have the same number of free parameters. For example,
$$\hat{Y} = \beta _0 + \beta _1X_{1} + \beta _2X_{2} + \beta _3X_{3}$$
has the same model complexity as
$$\hat{Y} = \beta _0 + \beta _1X_{1}^2 + \frac{\sin\left(\beta _2X_{2}\right)}{1 + X_{2}^3} + \beta _3\log \left(X_{3}^{-1}\right).$$
I understand these two models to be the same in terms of the complexity of statistical fitting, but the second is much more mathematically complex.
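To make my confusion concrete, here is a minimal sketch (assuming least-squares fits with Gaussian errors, so AIC can be written in terms of the residual sum of squares; the sample size and RSS values are hypothetical, just for illustration). The criterion only sees the parameter count $k$ and the achieved fit, never the functional form, so both models above, each with $k = 4$ coefficients $\beta_0, \ldots, \beta_3$, score identically whenever their fits are equal:

```python
import math

def aic(n, k, rss):
    """AIC for a least-squares fit with Gaussian errors:
    n * ln(RSS / n) + 2 * k (up to an additive constant)."""
    return n * math.log(rss / n) + 2 * k

# Both models above have k = 4 free parameters (beta_0 .. beta_3).
# Suppose each achieves the same residual sum of squares on n = 100 points
# (hypothetical numbers chosen only for illustration):
n, k, rss = 100, 4, 12.5

aic_linear = aic(n, k, rss)     # the purely linear model
aic_nonlinear = aic(n, k, rss)  # the sin/log/quadratic model

# The criterion cannot tell them apart: same k and same fit -> same AIC.
print(aic_linear == aic_nonlinear)
```

Nothing in the formula penalizes the $\sin$, $\log$, or reciprocal-cubic structure; only an extra free parameter (or a worse fit) would change the score.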
I haven't fully reasoned out why the latter model seems more complex, or why, if both explain the data equally well, I'd argue that the former is more elegant and parsimonious than the second. My gut and brain are telling me that the second is more complex, but I just need some guidance as to why that may or may not be the case.