I am using Adam as an optimization algorithm to minimize a black-box function $f(x,y)$. I know this function is convex and has a minimum $f(5,5) = 0$.
Initially, the algorithm proceeds as expected:
where the $x$ and $y$ axes are the respective variables, and the red dot indicates the target $x=5, y=5$.
However, at small values of $f(x,y)$, I get oscillations:
This is a problem if I want convergence to zero to within some precision $\epsilon \sim 10^{-12}$.
What is the cause of oscillations like this?
What is a solution to avoid/compensate for them?
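
For concreteness, here is a minimal sketch of the kind of setup described above. The real $f$ is a black box, so the quadratic `f`, its gradient `grad_f`, the starting point, and the hyperparameters (`lr`, `beta1`, `beta2`, `eps`) are all illustrative assumptions, and this stand-in may not reproduce the exact behaviour in the plots. It only shows the standard Adam update with a constant learning rate, recording $f(x,y)$ at every step so that the late-stage behaviour can be inspected.

```python
import math


def f(x, y):
    # Stand-in convex function with minimum f(5, 5) = 0.
    # Assumption: the real black-box function is not known here.
    return (x - 5.0) ** 2 + (y - 5.0) ** 2


def grad_f(x, y):
    # Analytic gradient of the stand-in function above.
    return [2.0 * (x - 5.0), 2.0 * (y - 5.0)]


def run_adam(n_steps=1000, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8,
             start=(0.0, 0.0)):
    # Standard Adam update with bias correction and a constant learning rate.
    # All hyperparameter values are illustrative assumptions.
    p = list(start)
    m = [0.0, 0.0]  # first-moment (mean) estimate per coordinate
    v = [0.0, 0.0]  # second-moment (uncentered variance) estimate
    history = [f(*p)]
    for t in range(1, n_steps + 1):
        g = grad_f(*p)
        for i in range(2):
            m[i] = beta1 * m[i] + (1.0 - beta1) * g[i]
            v[i] = beta2 * v[i] + (1.0 - beta2) * g[i] ** 2
            m_hat = m[i] / (1.0 - beta1 ** t)
            v_hat = v[i] / (1.0 - beta2 ** t)
            p[i] -= lr * m_hat / (math.sqrt(v_hat) + eps)
        history.append(f(*p))
    return history


history = run_adam()
# Initial value, best value seen, and final value of f.
print(history[0], min(history), history[-1])
```

Plotting `history` on a logarithmic axis is one way to see whether $f$ keeps decreasing or stalls and oscillates above the target precision.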