Questions tagged [hyperparameter]

A parameter that is not part of the statistical model (or data-generating process) itself, but a parameter of the statistical method. It could be, for example: a parameter of a family of prior distributions, a smoothing parameter, the penalty in regularization methods, or a setting of an optimization algorithm.

561 questions
56 votes · 6 answers

Practical hyperparameter optimization: Random vs. grid search

I'm currently going through Bergstra and Bengio's Random Search for Hyper-Parameter Optimization [1], where the authors claim that random search is more efficient than grid search at achieving approximately equal performance. My question is: Do people…
Bar · 2,492
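The core of Bergstra and Bengio's argument can be illustrated with a toy sketch (the objective function and hyperparameter names below are invented for illustration): when only one hyperparameter really matters, a 3×3 grid tries just three distinct values of it, while nine random trials try nine.

```python
import random

# Toy objective: sharply peaked in the learning rate, nearly flat in the
# (unimportant) second hyperparameter -- the low-effective-dimensionality
# setting where random search tends to beat grid search.
def objective(lr, reg):
    return -(lr - 0.37) ** 2 - 0.01 * (reg - 0.5) ** 2

# Grid search: 3 values per axis, 9 trials, but only 3 distinct lr values.
grid_scores = [objective(lr, reg)
               for lr in (0.1, 0.5, 0.9)
               for reg in (0.1, 0.5, 0.9)]

# Random search: same budget of 9 trials, but 9 distinct lr values.
random.seed(0)
rand_scores = [objective(random.uniform(0, 1), random.uniform(0, 1))
               for _ in range(9)]

best_grid, best_rand = max(grid_scores), max(rand_scores)
print(best_grid, best_rand)
```

With this seed the best random trial lands closer to the optimal learning rate than any grid point; the effect is about the shape of the objective, not the particular seed.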
52 votes · 1 answer

Do we have to tune the number of trees in a random forest?

Software implementations of random forest classifiers have a number of parameters to allow users to fine-tune the algorithm's behavior, including the number of trees $T$ in the forest. Is this a parameter that needs to be tuned, in the same way as…
Sycorax · 76,417
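A minimal sketch of the usual intuition, with a made-up stand-in for a tree rather than any actual implementation: averaging $T$ independent noisy predictors shrinks the variance of the ensemble roughly as $1/T$, which is why adding trees rarely hurts accuracy (only compute), so $T$ behaves differently from a conventional tuning parameter.

```python
import random

random.seed(1)

# Stand-in for one tree's prediction: the true value 1.0 plus
# independent zero-mean noise (a single "tree" has sd 0.5).
def tree_prediction():
    return 1.0 + random.gauss(0.0, 0.5)

# A "forest" just averages its trees' predictions.
def forest_prediction(n_trees):
    return sum(tree_prediction() for _ in range(n_trees)) / n_trees

# Repeat the experiment to estimate the spread of the ensemble average.
def spread(n_trees, repeats=500):
    preds = [forest_prediction(n_trees) for _ in range(repeats)]
    mean = sum(preds) / repeats
    return sum((p - mean) ** 2 for p in preds) / repeats

s5, s100 = spread(5), spread(100)
print(s5, s100)  # variance drops roughly 20x going from 5 to 100 trees
```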
39 votes · 3 answers

Guideline to select the hyperparameters in Deep Learning

I'm looking for a paper that could help in giving a guideline on how to choose the hyperparameters of a deep architecture, like stacked auto-encoders or deep belief networks. There are a lot of hyperparameters and I'm very confused about how to choose…
32 votes · 3 answers

How to build the final model and tune probability threshold after nested cross-validation?

Firstly, apologies for posting a question that has already been discussed at length here, here, here, here, and here, and for reheating an old topic. I know @DikranMarsupial has written about this topic at length in posts and journal papers, but I'm…
32 votes · 2 answers

What is the reason that the Adam Optimizer is considered robust to the value of its hyperparameters?

I was reading about the Adam optimizer for Deep Learning and came across the following sentence in the new book Deep Learning by Goodfellow, Bengio, and Courville: Adam is generally regarded as being fairly robust to the choice of hyperparameters,…
27 votes · 4 answers

How should Feature Selection and Hyperparameter optimization be ordered in the machine learning pipeline?

My objective is to classify sensor signals. My solution so far is: i) engineer features from the raw signal; ii) select relevant features with ReliefF and a clustering approach; iii) apply a neural network, random forest, and SVM. However, I am…
Grunwalski · 495
27 votes · 6 answers

Is hyperparameter tuning on sample of dataset a bad idea?

I have a dataset of 140,000 examples and 30 features for which I am training several classifiers for a binary classification (SVM, logistic regression, random forest, etc.). In many cases, hyperparameter tuning on the whole dataset using either grid or…
24 votes · 2 answers

Natural interpretation for LDA hyperparameters

Can somebody explain the natural interpretation of the LDA hyperparameters? ALPHA and BETA are the parameters of the Dirichlet distributions for the (per-document) topic and (per-topic) word distributions, respectively. However, can someone explain what it…
abhinavkulkarni · 778
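A hedged sketch of the standard reading (the topic count and sample sizes below are invented for illustration): a small symmetric alpha concentrates a document's mass on a few topics, while a large alpha spreads it evenly; the same interpretation applies to beta over words within a topic. Symmetric Dirichlet samples can be drawn by normalizing independent Gamma variates.

```python
import random

random.seed(0)

def dirichlet(alpha, k):
    """Sample from a symmetric Dirichlet(alpha) via normalized Gammas."""
    draws = [random.gammavariate(alpha, 1.0) for _ in range(k)]
    total = sum(draws)
    return [d / total for d in draws]

def mean_max_weight(alpha, k=5, n=1000):
    """Average mass carried by the single largest component."""
    return sum(max(dirichlet(alpha, k)) for _ in range(n)) / n

sparse = mean_max_weight(0.1)    # small alpha: most mass on one topic
diffuse = mean_max_weight(10.0)  # large alpha: mass near uniform (~1/k each)
print(sparse, diffuse)
```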
23 votes · 4 answers

How bad is hyperparameter tuning outside cross-validation?

I know that performing hyperparameter tuning outside of cross-validation can lead to biased-high estimates of external validity, because the dataset that you use to measure performance is the same one you used to tune the features. What I'm…
Ben Kuhn · 5,373
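The source of the optimism can be sketched in a few lines (the accuracy figures and number of configurations below are invented): even if every candidate setting has identical true performance, each validation estimate carries zero-mean noise, so reporting the maximum over many settings is biased upward.

```python
import random

random.seed(42)

# Every candidate hyperparameter setting has the SAME true accuracy,
# but each estimate comes from one finite validation set, so it
# carries zero-mean sampling noise.
TRUE_ACCURACY = 0.70
NOISE_SD = 0.03  # sampling noise of a smallish validation set

estimates = [TRUE_ACCURACY + random.gauss(0.0, NOISE_SD)
             for _ in range(50)]  # 50 tuning configurations tried

# Reporting the best validation score as "the" performance:
reported = max(estimates)
print(reported)  # optimistically biased above the true 0.70
```

Nesting the tuning inside cross-validation removes this bias because the outer fold scoring the winner was never used to pick it.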
22 votes · 3 answers

How to get hyperparameters in nested cross-validation?

I have read the following posts on nested cross-validation and am still not 100% sure how to approach model selection with nested cross-validation: Nested cross validation for model selection; Model selection and cross-validation: The right…
19 votes · 2 answers

Is decision threshold a hyperparameter in logistic regression?

Predicted classes from (binary) logistic regression are determined by using a threshold on the class membership probabilities generated by the model. As I understand it, typically 0.5 is used by default. But varying the threshold will change the…
Nick · 393
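One way to see what is at stake: the fitted model is untouched by the threshold; only the rule converting probabilities into labels changes, along with any downstream metric. A minimal sketch, with made-up predicted probabilities and labels:

```python
# Toy predicted class-membership probabilities from some fitted
# logistic regression (values invented for illustration).
probs  = [0.15, 0.35, 0.45, 0.55, 0.65, 0.92]
labels = [0,    0,    1,    0,    1,    1]

def accuracy(threshold):
    """Accuracy of thresholding the fixed probabilities at `threshold`."""
    preds = [1 if p >= threshold else 0 for p in probs]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)

acc_default = accuracy(0.5)  # the conventional default cut-off
acc_tuned   = accuracy(0.4)  # lower cut-off catches the 0.45 positive
print(acc_default, acc_tuned)
```

Because no refitting occurs, one can argue the threshold is a decision-theoretic choice layered on top of the model rather than a hyperparameter of the model itself, though if it is tuned on data it must be validated like one.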
19 votes · 5 answers

What's in a name: hyperparameters

So in a normal distribution, we have two parameters: mean $\mu$ and variance $\sigma^2$. In the book Pattern Recognition and Machine Learning, there suddenly appears a hyperparameter $\lambda$ in the regularization terms of the error function. What…
cgo · 7,445
19 votes · 2 answers

Advantages of Particle Swarm Optimization over Bayesian Optimization for hyperparameter tuning?

There's substantial contemporary research on Bayesian Optimization (1) for tuning ML hyperparameters. The driving motivation here is that a minimal number of data points are required to make informed choices about what points are worthwhile to try…
Sycorax · 76,417
18 votes · 2 answers

How to use xgboost.cv with hyperparameter optimization?

I want to optimize the hyperparameters of XGBoost using cross-validation. However, it is not clear how to obtain the model from xgb.cv. For instance, I call objective(params) from fmin. Then the model is fitted on dtrain and validated on dvalid. What if I…
Klausos · 499
17 votes · 3 answers

Hyperparameter tuning: Random search vs. Bayesian optimization

So, we know that random search works better than grid search, but a more recent approach is Bayesian optimization (using Gaussian processes). I've looked for a comparison between the two and found nothing. I know that in Stanford's cs231n they…
Yoni Keren · 526