I'm currently running 10-fold cross-validation, cycling through different combinations of hyperparameters for XGBoost. I know this is a complicated process with a lot of variables in play. I'm looking at max_depth values from 5 to 10 and learning rates (eta) of 0.01, 0.00082, 0.00064, 0.00046, and 0.00028. That's 6 depths × 5 learning rates = 30 hyperparameter combinations, each of which has to be run on all 10 folds, so 300 fits total. XGBoost usually needs hundreds of boosting rounds (n_estimators/nrounds) before early stopping triggers. I have two computers that have been grinding away at this for 7 hours and they're only about halfway finished. It takes forever! And after all this, the model might not even perform well in cross-validation, let alone on the test data, and I'll have to do it all over again.
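For concreteness, the loop is roughly this (Python sketch; X, y, the objective, and the num_boost_round cap are placeholders for my actual setup):

```python
import itertools

import xgboost as xgb

# X, y stand in for my real training features/labels
etas = [0.01, 0.00082, 0.00064, 0.00046, 0.00028]
depths = range(5, 11)  # max_depth 5 through 10

dtrain = xgb.DMatrix(X, label=y)

results = {}
for eta, depth in itertools.product(etas, depths):  # 5 x 6 = 30 combos
    params = {
        "eta": eta,
        "max_depth": depth,
        "objective": "binary:logistic",  # placeholder; mine may differ
    }
    # xgb.cv runs all 10 folds internally; early stopping cuts off
    # boosting once the validation metric stops improving
    cv = xgb.cv(
        params,
        dtrain,
        num_boost_round=5000,  # generous cap, trimmed by early stopping
        nfold=10,
        early_stopping_rounds=50,
        seed=42,
    )
    results[(eta, depth)] = cv.iloc[-1]  # last row = best-iteration scores
```

At the small etas, early stopping barely helps because it takes so many rounds to converge, which is where all the time goes.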
Surely there's a way to speed this up short of renting a cloud computing cluster. If I jack up the learning rate it's a little faster, but still very slow because of the large number of n_estimators. And if I make n_estimators very low and use learning rates like 0.1-0.3, the results don't actually tell me whether the hyperparameters I care about are any good. All of this doesn't even factor in the other hyperparameters XGBoost has to offer. What techniques do you guys use to speed this process up? Or is there no alternative and it just takes forever no matter what?