I was watching Andrew Ng's lecture on the difference between parameters and hyperparameters, https://www.youtube.com/watch?v=VTE2KlfoO3Q&ab_channel=Deeplearning.ai, and a question came to mind.
Is there really that much of a distinction between hyperparameters and parameters?
For example, weights are usually regarded as parameters rather than hyperparameters. But a recent paper (https://arxiv.org/abs/1803.07055) found that random search over the weights can obtain good results and even beat state-of-the-art optimization methods. Is this not the same method we use for hyperparameter tuning? A rough sketch of what I mean is below.
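To make the comparison concrete, here is a minimal sketch (it ignores all the details of the paper's actual algorithm): the exact same random-search loop works whether the variables being searched over are weights or hyperparameters; only the objective function changes. The `validate` routine in the comment is hypothetical.

```python
# Minimal random search: the same loop tunes weights or hyperparameters.
import numpy as np

rng = np.random.default_rng(0)

def random_search(objective, dim, n_iters=1000, step=0.1):
    """Hill-climb by sampling random perturbations and keeping improvements."""
    x = rng.normal(size=dim)           # could be weights OR hyperparameters
    best = objective(x)
    for _ in range(n_iters):
        candidate = x + step * rng.normal(size=dim)
        score = objective(candidate)
        if score < best:
            x, best = candidate, score
    return x, best

# Searching over "parameters": weights of a linear model on toy data.
X = rng.normal(size=(100, 5))
y = X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0]) + 0.1 * rng.normal(size=100)
weight_loss = lambda w: np.mean((X @ w - y) ** 2)

# Searching over "hyperparameters" would only swap the objective, e.g. a
# (log-learning-rate, log-l2) pair scored by a hypothetical `validate`:
# hyper_loss = lambda h: validate(lr=np.exp(h[0]), l2=np.exp(h[1]))

w, loss = random_search(weight_loss, dim=5)
print(loss)  # the unchanged loop would tune hyperparameters just as well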
At the same time, there are papers that learn the learning rate, the optimizer, and other so-called "hyperparameters" associated with a model: https://arxiv.org/abs/1606.04474
Then there are methods that directly learn the hyperparameters through gradient-based optimization: https://arxiv.org/abs/1903.03088 (a sketch of the basic idea follows).
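To illustrate the idea shared by both of those papers, here is about the simplest gradient-based hyperparameter update I can write down: hypergradient descent on the learning rate. This is not the method of either linked paper, just the underlying idea that a "hyperparameter" can receive a gradient and an update step exactly like a weight does.

```python
# Hypergradient descent on the learning rate (sketch, not the papers' methods).
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = X @ rng.normal(size=5)

w = np.zeros(5)
lr = 0.01          # the "hyperparameter", treated as a learnable quantity
meta_lr = 1e-4     # step size for updating the hyperparameter itself

prev_grad = np.zeros(5)
for _ in range(200):
    grad = 2 * X.T @ (X @ w - y) / len(y)   # dL/dw for squared error
    # Since w_t = w_{t-1} - lr * grad_{t-1}, the chain rule gives
    # dL_t/d(lr) = grad_t . (-grad_{t-1}); descend on that hypergradient.
    lr -= meta_lr * float(grad @ -prev_grad)
    w -= lr * grad
    prev_grad = grad

print(lr)  # the "hyperparameter" was learned like any other parameter
```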
Another inspiration is adaptive control (a huge field, now spanning five decades), where the so-called "hyperparameters" associated with the controller are always learned online.
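For readers who haven't seen adaptive control, here is a minimal sketch of the classic MIT rule from model-reference adaptive control: the controller gain (which a deep-learning paper might well call a "hyperparameter") is adapted online from the tracking error. The plant and reference-model constants here are made up for illustration.

```python
# MIT rule sketch: adapt a controller gain online from the tracking error.
import numpy as np

dt, gamma = 0.01, 0.5           # integration step and adaptation gain (assumed)
k_plant, k_model = 2.0, 1.0     # unknown plant gain vs. reference-model gain

y = y_m = 0.0                   # plant and reference-model outputs
theta = 0.0                     # adaptive feedforward gain, learned online

for t in range(5000):
    r = np.sign(np.sin(0.01 * t))        # square-wave reference signal
    u = theta * r                        # control law
    y   += dt * (-y + k_plant * u)       # plant:     y'   = -y   + k * u
    y_m += dt * (-y_m + k_model * r)     # reference: y_m' = -y_m + k_m * r
    e = y - y_m                          # tracking error
    theta += dt * (-gamma * e * y_m)     # MIT rule: theta' = -gamma * e * y_m

print(theta, k_model / k_plant)  # theta converges toward k_m / k = 0.5
```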