I am performing $k$-fold cross-validation for model selection via maximum-likelihood estimation.
At the moment I am using a standard optimizer (fmincon in MATLAB). Since the likelihood landscape is complex, with a myriad of local minima, I need to restart the optimization hundreds or even thousands of times to explore the landscape and find what seems to be the global optimum. For this reason, I have been trying to find methods to improve the efficiency of the optimization (see also this post for an orthogonal attempt at obtaining a speed-up).
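For concreteness, here is a minimal sketch of the kind of restart loop I mean; `negloglik`, `lb`, `ub` and `nRestarts` are placeholder names, not my actual code:

```matlab
% Dumb multi-start: repeatedly run fmincon from random points in the box
% [lb, ub] and keep the best solution found. negloglik is the objective
% (negative log-likelihood); lb and ub are row vectors of box bounds.
nRestarts = 500;
d = numel(lb);                         % problem dimensionality
bestNLL = Inf;
bestX = [];
opts = optimoptions('fmincon', 'Display', 'off');
for i = 1:nRestarts
    x0 = lb + rand(1, d) .* (ub - lb); % uniform random starting point
    [x, nll] = fmincon(negloglik, x0, [], [], [], [], lb, ub, [], opts);
    if nll < bestNLL                   % keep the best local optimum so far
        bestNLL = nll;
        bestX = x;
    end
end
```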
My strategy so far has been to restart the optimizer from different starting points drawn from a space-filling quasi-random Sobol sequence; the logic is much the same as a Latin hypercube design, except that it is easy to add new points to the sequence. This strategy is dumb in that it treats each restart as independent and does not use any information about the objective function acquired during previous optimizations.
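In code, the only change with respect to the loop above is how the starting points are generated; this sketch assumes the Statistics and Machine Learning Toolbox:

```matlab
% Quasi-random starting points from a scrambled Sobol sequence, rescaled
% from the unit hypercube to the box [lb, ub].
p = sobolset(d, 'Skip', 1e3);            % skip the first, less uniform points
p = scramble(p, 'MatousekAffineOwen');   % randomized scrambling
X0 = net(p, nRestarts);                  % nRestarts-by-d matrix in [0,1]^d
X0 = lb + X0 .* (ub - lb);               % implicit expansion (R2016b+)
% Restart i then uses x0 = X0(i, :) in place of the random draw above.
```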
I know that there are "semi-smart" multi-start methods for global optimization that choose the starting points in an adaptive way. For example, some multi-start methods, such as Multi-Level Single-Linkage (MLSL), estimate the basins of attraction of the minima found so far and avoid starting new runs from within an existing basin. I have skimmed some of the papers, but I am not sure how effective these methods would be in practice.
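To make the kind of scheme I have in mind concrete, here is a hypothetical sketch in the spirit of MLSL: reject candidate starting points that fall within a critical radius of a local minimum already found. The function name and the fixed radius `rCrit` are my own assumptions (actual MLSL shrinks the radius as the number of samples grows), and `pdist2` again requires the Statistics and Machine Learning Toolbox:

```matlab
function keep = farFromKnownMinima(candidates, minima, rCrit)
% Simple basin-avoidance filter: retain only candidate starting points
% that lie farther than rCrit from every local minimum found so far.
%   candidates: m-by-d matrix of proposed starting points
%   minima:     k-by-d matrix of local minima from previous restarts
%   keep:       m-by-1 logical vector, true for retained points
    if isempty(minima)
        keep = true(size(candidates, 1), 1);
        return;
    end
    D = pdist2(candidates, minima);   % m-by-k pairwise Euclidean distances
    keep = all(D > rCrit, 2);         % far from every known minimum
end
```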
Does anybody have experience with this kind of adaptive multi-start method, and can recommend one?
To preempt some answers: I have tried other local and semi-local optimizers, such as CMA-ES, to no avail. I have also tried some global optimizers, such as MCS, to no avail. I am pretty familiar with a variety of local and global optimizers (see my answer to this post), but I am open to new suggestions. Bayesian optimization (see my answer to this post) is not on the table for this problem, though, because BO is way too slow.