
I am using neural network algorithms on a relatively large dataset with 1700 observations and 40 features.
I performed hyperparameter optimization with nested cross-validation.
I also wanted to compare 5 algorithms with each other by benchmarking.
When I selected about 5 hyperparameters to tune (number of nodes, number of layers, alpha, dropout, epochs), the computation took so long that I canceled it.

Since tuning many hyperparameters, especially on a large dataset with many features, is so computationally expensive, is it acceptable to select only a limited subset of hyperparameters to tune (e.g., just Num_nodes and dropout) rather than all or most of them?
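Here is a minimal sketch of what I mean, using scikit-learn's RandomizedSearchCV with an MLPClassifier as a stand-in for my network (the data and the search space below are made up): only two hyperparameters are tuned, and the rest are fixed.

```python
# A minimal sketch: tune only two hyperparameters, fix the rest.
from scipy.stats import loguniform
from sklearn.datasets import make_classification
from sklearn.model_selection import RandomizedSearchCV
from sklearn.neural_network import MLPClassifier

# Stand-in data with the same shape as my dataset (1700 x 40).
X, y = make_classification(n_samples=1700, n_features=40, random_state=0)

# Fixed hyperparameters: chosen up front, not tuned.
base = MLPClassifier(solver="adam", max_iter=200, early_stopping=True,
                     random_state=0)

# Tuned hyperparameters: only the layer sizes and the L2 penalty (alpha).
param_distributions = {
    "hidden_layer_sizes": [(20,), (40,), (80,), (40, 20)],
    "alpha": loguniform(1e-5, 1e-1),
}

search = RandomizedSearchCV(base, param_distributions,
                            n_iter=20,  # number of sampled tuples = budget
                            cv=5, n_jobs=-1, random_state=0)
search.fit(X, y)
print(search.best_params_)
```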
I searched Google and the SO questions but did not find an answer.
I appreciate your kind help.

Killbill
  • You can fix some hyperparameters, sure. A valid response from a boss or a reviewer is, "How did you pick that value?" If you found it through cross-validation, that is straightforward to defend. If you just picked it because it is your lucky number or your birthday, that might be hard to defend. – Dave Apr 19 '21 at 16:47
  • You don’t need to do grid search, by the way. Randomized search performs very well in less time. There’s also the Bayesian optimization route (see the sketch at the end of this thread). – Arya McCarthy Apr 19 '21 at 16:49
  • Many thanks. I want to know: in your practice, do you optimize all hyperparameters (even if there are many and the sample size is large)? – Killbill Apr 19 '21 at 17:41
  • Something that can be defended is mimicking an architecture that you know works for a similar problem. – Dave Apr 19 '21 at 17:42
  • Aha, OK, thanks. Or maybe I have to buy a faster system capable of analyzing massive data. – Killbill Apr 19 '21 at 17:48
  • @Arya McCarthy, thanks, but my question was about selecting which hyperparameters to optimize. Actually I used random_search for optimization, but I wanted to optimize at most one or two hyperparameters, not many. – Killbill Apr 19 '21 at 17:52
  • It's easy to run some simple deep learning, have it complete in a reasonable amount of time, and dismiss claims that deep learning requires beefy hardware. Then you get into doing thousands of cross-validation training runs, and that twenty minutes of training time turns into two weeks. – Dave Apr 19 '21 at 17:56
  • In random search, you make two decisions. (1) What distribution(s) do I use for selecting hyperparameter values at random? For example, what interval am I sampling from? (2) How many tuples do I test? From the standpoint of computational budget, only (2) can increase/decrease the required budget. You can test as many hyperparameters as you wish (see the sketch after this thread). – Sycorax Apr 19 '21 at 18:16
  • @Sycorax, thanks. So is it true to say: "It would be optimal to tune as many hyperparameters as possible, but because my system lacks heavy computational capability, I have to choose only some of them"? But as mentioned in the comments, we have nothing to defend with if someone asks why we did not select certain hyperparameters. – Killbill Apr 19 '21 at 18:24
  • No, I'm saying you can randomly search to tune over all of your desired parameters, but when you're time-constrained, you're limited by how many parameter *tuples* you can test in the allotted time. – Sycorax Apr 19 '21 at 21:21
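To make Sycorax's last point concrete, here is a minimal sketch (scikit-learn; the search space is illustrative): the cost of a randomized search is n_iter times the number of CV folds in model fits, no matter how many hyperparameters appear in the search space.

```python
# Sketch of the budget argument; the search space is illustrative.
from scipy.stats import loguniform, randint
from sklearn.model_selection import RandomizedSearchCV
from sklearn.neural_network import MLPClassifier

# Five hyperparameters searched at once...
wide_space = {
    "hidden_layer_sizes": [(20,), (40,), (80,), (40, 20), (80, 40)],
    "alpha": loguniform(1e-5, 1e-1),
    "learning_rate_init": loguniform(1e-4, 1e-1),
    "batch_size": randint(16, 257),
    "max_iter": randint(50, 401),
}

# ...yet the budget is still n_iter * cv = 20 * 5 = 100 model fits,
# the same as if only one hyperparameter were in the space.
search = RandomizedSearchCV(MLPClassifier(random_state=0), wide_space,
                            n_iter=20, cv=5, n_jobs=-1, random_state=0)
```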

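As for the Bayesian optimization route mentioned by Arya McCarthy above, a minimal sketch with scikit-optimize's BayesSearchCV (assuming the scikit-optimize package is installed; the search space is illustrative) looks like this: it also takes a fixed evaluation budget via n_iter, but picks each new tuple based on the results so far instead of at random.

```python
# Sketch of the Bayesian optimization route with scikit-optimize
# (pip install scikit-optimize); the search space is illustrative.
from skopt import BayesSearchCV
from skopt.space import Real
from sklearn.neural_network import MLPClassifier

search = BayesSearchCV(
    MLPClassifier(max_iter=200, random_state=0),
    {
        "alpha": Real(1e-5, 1e-1, prior="log-uniform"),
        "learning_rate_init": Real(1e-4, 1e-1, prior="log-uniform"),
    },
    n_iter=20,  # evaluation budget, as with random search
    cv=5,
    random_state=0,
)
# search.fit(X, y), then inspect search.best_params_ as usual
```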