I have a black-box loss function that is evaluated by an external simulator. It accepts two vectors $x$ and $y$: $L(x,y)$. For a given $x$, I am free to choose $y$, so I would like to pick the $y$ that minimizes $L(x,y)$. If we model the mapping from $x$ to the optimal $y$ with a neural network producing $y=f(x)$, I would like to iteratively improve $f$ through training while evaluating $L$ as few times as possible.
I am wondering what would be the best method to do this training.
I am aware of the cross-entropy method, which improves iteratively by refitting the sampling distribution to the elite samples. I am wondering if there are any other methods I should consider before working on the implementation.
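For reference, here is a minimal sketch of the cross-entropy method I mean, applied to choosing $y$ for a single $x$. The loss here is a toy quadratic standing in for the black-box simulator, and all the function/parameter names are just illustrative:

```python
import numpy as np

def cem_minimize(loss, x, dim, n_iter=50, pop_size=64, elite_frac=0.125, seed=0):
    """Cross-entropy method: sample candidates from a Gaussian,
    keep the elites (lowest loss), refit the Gaussian, repeat."""
    rng = np.random.default_rng(seed)
    mean, std = np.zeros(dim), np.ones(dim)
    n_elite = max(1, int(pop_size * elite_frac))
    for _ in range(n_iter):
        samples = rng.normal(mean, std, size=(pop_size, dim))
        losses = np.array([loss(x, y) for y in samples])
        elites = samples[np.argsort(losses)[:n_elite]]
        # refit the sampling distribution to the elite set
        mean = elites.mean(axis=0)
        std = elites.std(axis=0) + 1e-6  # floor to avoid premature collapse
    return mean

# toy quadratic loss standing in for the black-box simulator
target = np.array([1.0, -2.0])
y_star = cem_minimize(lambda x, y: np.sum((y - target) ** 2), x=None, dim=2)
```

Each call to `loss` corresponds to one simulator evaluation, so the budget per $x$ is `n_iter * pop_size`, which is exactly the cost I would like a better method to reduce.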