
When approximating a set of points with a model consisting of a function plus an error term, is there a general tool or criterion to decide how many parameters are optimal?

For example, a set of points in two dimensions can be fit by polynomial regression. The higher the degree of the polynomial, the more closely the points are approximated, but past a certain point the polynomial becomes meaningless: it is merely encoding the data in a different coordinate system (the polynomial coefficients) and loses predictive value.
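
Here is a minimal sketch of that effect on hypothetical data (the "true" curve is a cubic chosen for illustration): training error keeps falling as the degree grows, but the error on held-out points eventually rises, which is exactly the loss of predictive value described above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Noisy samples from an underlying cubic; the "true" degree is 3.
x = np.linspace(-1, 1, 40)
y = 1.0 - 2.0 * x + 0.5 * x**3 + rng.normal(scale=0.1, size=x.size)

# Hold out every fourth point to measure predictive value.
test = np.arange(x.size) % 4 == 0
x_tr, y_tr, x_te, y_te = x[~test], y[~test], x[test], y[test]

for degree in range(1, 10):
    coeffs = np.polyfit(x_tr, y_tr, degree)           # least-squares fit
    mse_tr = np.mean((np.polyval(coeffs, x_tr) - y_tr) ** 2)
    mse_te = np.mean((np.polyval(coeffs, x_te) - y_te) ** 2)
    print(f"degree {degree:2d}: train MSE {mse_tr:.4f}, test MSE {mse_te:.4f}")
```

Scoring the fit on points it never saw is the simplest form of out-of-sample evaluation; the train MSE decreases monotonically with degree while the test MSE bottoms out near the true degree.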

Is there any way to weigh economy of parameters against predictive value? Is there a theory that deals with this? (I guess it is a common problem.)

Please orient me on which mathematical tools, theories, criteria, or algorithms are used for that.

I'm not looking for a precise answer, but a general orientation like "go look at that", "learn about XXX", "minimize/maximize YYY".

tutizeri
  • AIC, BIC, adjusted $R^2$, and out-of-sample performance are some terms that will interest you (see the first sketch after these comments). – Dave May 21 '20 at 20:30
  • there's no general theory, just a bunch of heuristics. – Aksakal May 21 '20 at 20:30
  • There isn't one; moreover, there are some tricky results showing that for some models having too many parameters is bad, while for others it doesn't seem to be the case: https://stats.stackexchange.com/questions/4284/intuitive-explanation-of-the-bias-variance-tradeoff/444014#444014 TL;DR: there's no simple rule for that. – Tim May 21 '20 at 20:35
  • You are raising two different problems: to choose the degree of your polynomial you need *residual analysis*; to choose the best number of parameters you need algorithms (e.g. *forward stepwise regression*, *forward selection*, *backward elimination*) and criteria (*AIC*, *SBC*, *PRESS*, etc.) (see the second sketch after these comments). – Sergio May 21 '20 at 20:52
  • @Dave I would mark your comment as answer if you write it as answer. – tutizeri Jun 27 '20 at 01:27
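
Building on Dave's comment, AIC and BIC are penalized-likelihood criteria: they reward goodness of fit but charge for every extra parameter, and one prefers the degree that minimizes them. A minimal sketch on the same hypothetical data as above, using the standard Gaussian-noise forms (up to an additive constant):

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-1, 1, 40)
y = 1.0 - 2.0 * x + 0.5 * x**3 + rng.normal(scale=0.1, size=x.size)
n = x.size

for degree in range(1, 10):
    coeffs = np.polyfit(x, y, degree)
    rss = np.sum((np.polyval(coeffs, x) - y) ** 2)
    k = degree + 2                      # coefficients plus the noise variance
    # Gaussian log-likelihood yields these standard forms (up to a constant):
    aic = n * np.log(rss / n) + 2 * k
    bic = n * np.log(rss / n) + k * np.log(n)
    print(f"degree {degree:2d}: AIC {aic:7.2f}, BIC {bic:7.2f}")
```

BIC's $k \log n$ penalty grows with the sample size, so it tends to pick smaller models than AIC.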
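And a sketch of the forward-selection idea from Sergio's comment, again on hypothetical data: start from the intercept-only model and greedily add whichever candidate term lowers BIC the most, stopping when no addition helps.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-1, 1, 40)
y = 1.0 - 2.0 * x + 0.5 * x**3 + rng.normal(scale=0.1, size=x.size)
n = x.size

candidates = {f"x^{d}": x**d for d in range(1, 10)}  # candidate terms
selected = ["1"]                                     # start with intercept only
columns = [np.ones(n)]

def bic(cols):
    """BIC of the least-squares fit using the given design columns."""
    X = np.column_stack(cols)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = np.sum((y - X @ beta) ** 2)
    return n * np.log(rss / n) + X.shape[1] * np.log(n)

best = bic(columns)
improved = True
while improved and candidates:
    improved = False
    # Try each remaining candidate; keep the one with the lowest BIC.
    scores = {name: bic(columns + [col]) for name, col in candidates.items()}
    name = min(scores, key=scores.get)
    if scores[name] < best:
        best = scores[name]
        columns.append(candidates.pop(name))
        selected.append(name)
        improved = True

print("selected terms:", selected, "BIC:", round(best, 2))
```

On this data the procedure should keep roughly the linear and cubic terms, matching the true model, and stop once further terms no longer pay for their penalty.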

0 Answers