
I understand the view that ML is one big optimization problem: we try to minimize a loss function and reach an optimal solution given the input. To achieve that, we feed in a loss function (let's say accuracy) and an optimizer (let's say stochastic gradient descent), which the model uses to tune its parameters, if it is a parametric learner, unlike kNN. But in the end, the loss function, the optimizer, the decision-boundary shape in an SVM, the hidden-layer count in a neural network, the maximum depth of a tree in bagging, and the base estimator in boosting are all hyperparameters that the user needs to tune with the bias–variance trade-off in mind.

Assuming that we have unlimited resources, can't we just find the strongest model by running a large GridSearchCV over many hyperparameter combinations? It boils down to this: is ML all about hyperparameter tuning? If not, what am I missing?
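To make the question concrete, the "large GridSearchCV" idea can be sketched with scikit-learn (the estimator and grid below are just illustrative choices, not a claim about what the best search space is):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Illustrative grid: `kernel` controls the decision-boundary shape,
# `C` trades off margin width against training error.
param_grid = {
    "kernel": ["linear", "rbf", "poly"],
    "C": [0.1, 1.0, 10.0],
}

search = GridSearchCV(SVC(), param_grid, cv=5, scoring="accuracy")
search.fit(X, y)
print(search.best_params_, search.best_score_)
```

With unlimited resources the grid could in principle be made arbitrarily large; the question is whether that exhausts what ML is.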

I am asking specifically about cases where interpretability is not important and the sole purpose is achieving the highest test score.

Thanks in advance!

realkes
    Regarding close votes: I disagree that this question is opinion based. It is a "high-level" question (or, you might say "philosophical"), but it is about the core problems which ML practitioners need to solve in their work. Questions on a similar level, say, regarding Bayesian and frequentist statistics, are abundant on this site, but are often equally "opinion based". – Igor F. Feb 13 '22 at 07:41
  • @Igor Examples, please? – whuber Feb 13 '22 at 15:24
  • @whuber https://stats.stackexchange.com/questions/103625/when-are-bayesian-methods-preferable-to-frequentist https://stats.stackexchange.com/questions/258045/should-i-teach-bayesian-or-frequentist-statistics-first https://stats.stackexchange.com/questions/490738/reporting-using-bayesian-and-frequentist-statistics-interchangeably-in-a-study https://stats.stackexchange.com/questions/142124/what-frequentist-statistics-topics-should-i-know-before-learning-bayesian-statis https://stats.stackexchange.com/questions/281463/is-bias-a-frequentist-concept-or-a-bayesian-concept – Igor F. Feb 13 '22 at 18:13
  • 1
    @Igor Thank you. The first is a good example. Most of the rest are fairly specific questions. But please don't take the existence of or even tolerance of an apparently off-topic question to be a reason to permit additional off-topic questions to appear! – whuber Feb 13 '22 at 18:54
  • 1
  • 1
    @whuber You're welcome. I get your point. In this particular case, I consider the question relevant. Although it does not allow for a clear yes/no answer, I think it's useful for identifying/formulating the problem. – Igor F. Feb 13 '22 at 19:11
  • Off topic or not, I'd like to see this discussed. – Carl Feb 18 '22 at 09:05

2 Answers


Indeed, a large part of ML is hyperparameter tuning, besides finding the appropriate method/model for your task, though you could argue that this is part of hyperparameter tuning, too. Especially for a user who just needs to apply models to get answers, there is not much more than hyperparameter tuning. (Granted, to get the job done in practice, you usually have to do a lot of data preprocessing and build some kind of ML pipeline, i.e. all the data-engineering work. That often takes up 99% of your time, but it is arguably not really part of ML.)
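That pipeline-plus-tuning workflow can be sketched in scikit-learn; the dataset, preprocessing step, and model here are illustrative assumptions, not a recommendation:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)

# Preprocessing and model are chained so cross-validation sees
# the whole pipeline, not just the final estimator.
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("clf", LogisticRegression(max_iter=5000)),
])

# Parameters of a pipeline step are addressed as <step>__<param>.
grid = {"clf__C": [0.01, 0.1, 1.0, 10.0]}
search = GridSearchCV(pipe, grid, cv=5)
search.fit(X, y)
```

Tuning the pipeline as a whole avoids leaking information from the preprocessing step into the cross-validation folds.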

For the designers of models, algorithms, and model-selection methods, ML is more than this. It's a lot about stochastics, optimization, functional analysis, geometry, all kinds of math.

And it is undeniable that you often do a much better job at hyperparameter tuning if you know the inner workings of the models.

frank
  • The last point is crucial. In most (perhaps all) contexts you have some computational constraints and can't just tune every hyperparameter over every possible value without knowing what you are doing. You actually need to understand what each of them does to know whether it's worth tuning, in what range, and what it will approximately do when increased or decreased. Do you also need to set other hyperparameters differently because they interact with the first one? It does require experience and understanding. – isarandi Feb 16 '22 at 17:15
  • And in fact this isn't too different from other fields, such as chemical engineering, let's say. – isarandi Feb 16 '22 at 17:16

It depends on your definition of "hyperparameters": Is the model itself a hyperparameter? The loss function? The choice of input features? The preprocessing method(s)? Etc.

If yes, then your view can be justified: applying machine learning boils down to finding an optimal set of "hyperparameters". There is, however, a theory behind these algorithms, and it definitely involves more. The people who invented LSTMs or SVMs certainly did more than just "hyperparameter tuning", at least in my view. But you might find a definition of "hyperparameters" by which any activity involving data processing with quantifiable results is "hyperparameter tuning".
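Even the "model itself as a hyperparameter" reading can be expressed mechanically in scikit-learn, by letting a grid search swap out the estimator in a pipeline step. The classifiers and grids below are illustrative assumptions:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# The "model" step is a placeholder that the grid search replaces.
pipe = Pipeline([("model", SVC())])

# A list of grids: each dict swaps a different estimator into the
# "model" step, so the model class itself becomes a search dimension.
param_grid = [
    {"model": [SVC()], "model__C": [0.1, 1.0, 10.0]},
    {"model": [KNeighborsClassifier()], "model__n_neighbors": [3, 5, 7]},
]

search = GridSearchCV(pipe, param_grid, cv=5)
search.fit(X, y)
```

This shows why the boundary is blurry: once the estimator is just another entry in the grid, "choosing a model" and "tuning a hyperparameter" look the same to the search procedure, even though designing those estimators in the first place clearly was not tuning.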

Igor F.
  • Thanks @Igor for the answer. What I meant by hyperparameters is the part that the users tune themselves and not the parameters that are tuned by the model itself. – realkes Feb 13 '22 at 18:23