The example performs a 100-dimensional optimization (see `#define N 100` in the code). The author prints only the first two components of $x = (x[0], x[1], \dots, x[N-1])$, as shown below for iteration 1.
Iteration 1:
fx = 254.065298, x[0] = -1.069065, x[1] = 1.053443
xnorm = 10.612828, gnorm = 325.365479, step = 0.000607
The objective $f(x)$ is defined in the function `evaluate`:
    for (i = 0; i < n; i += 2) {
        lbfgsfloatval_t t1 = 1.0 - x[i];
        lbfgsfloatval_t t2 = 10.0 * (x[i+1] - x[i] * x[i]);
        g[i+1] = 20.0 * t2;
        g[i] = -2.0 * (x[i] * g[i+1] + t1);
        fx += t1 * t1 + t2 * t2;
    }
Here `fx` holds the function value at $x$, and $g(x)$ is the $N\times 1$ (here $100\times 1$) gradient vector. The objective is
\begin{align}
f(x) = \sum_{i=0,2,4,...,N-2}(1-x_i)^2 + \left(10(x_{i+1} - x_i^2)\right)^2
\end{align}
The gradient at odd components ($i+1$ with $i=0,2,4,\dots$) is
$$\frac{\partial f}{\partial x_{i+1}} = 200(x_{i+1} - x_i^2)$$
The gradient at even components ($i=0,2,4,\dots$) is
$$\frac{\partial f}{\partial x_i} = -2(1-x_i) - 400\,x_i(x_{i+1} - x_i^2)$$
Note:
- indexing starts from 0
- the gradients in the code are in fact correct: since `t2` already carries a factor of 10, `g[i+1] = 20.0 * t2` equals $200(x_{i+1} - x_i^2)$. Changing that line to `g[i+1] = 200.0 * t2;` introduces a spurious extra factor of 10, which is why it produces a different answer.
Our fitting problem
In our case, we have a two-dimensional problem. Rename our $f(x,y)$ to $z$. We then have an $m\times 3$ matrix of fixed values, each row a tuple $(x_j, y_j, z_j)$, $j=1,\dots,m$. We can now minimize the function $h(a,b)$
\begin{align}
h(a,b) = \sum_{j=1}^{m}(a\cos(x_j) +b y_j - z_j)^2
\end{align}
With
\begin{align}
\frac{\partial h(a,b)}{\partial a} = 2\sum_{j=1}^{m}\left((a\cos(x_j) +b y_j - z_j)\cos(x_j)\right)\\
\frac{\partial h(a,b)}{\partial b} = 2\sum_{j=1}^{m}\left((a\cos(x_j) +b y_j - z_j)y_j\right)
\end{align}
as the gradient functions.
All you need to do is encode these in place of the for loop above, change `#define N 100` to `#define N 2`, and pass an initial guess for $a,b$ into the `lbfgs` function.