0

I am trying to use bfgs algorithm in order to fit a set of $\{(x,y),f(x,y)\}$ to a function in the form of let's say $a\cdot cos(x)+b \cdot y=f(x,y)$.

I try to understand how to use bfgs algorithm with liblbfgs, but I don't understand the example and it is not clear what function the author tried to fit.

0x90
  • 687
  • 1
  • 5
  • 17

2 Answers2

1

The example is doing a 100-dimensional (see #define N 100 in the code) optimization. The author is only printing the first two dimensions of x = (x[0],x[1],...,x[N]) as shown below for iteration 1.

Iteration 1:
fx = 254.065298, x[0] = -1.069065, x[1] = 1.053443
xnorm = 10.612828, gnorm = 325.365479, step = 0.000607

Now f(x) is defined in the function evaluate.

for (i = 0;i < n;i += 2) {
    lbfgsfloatval_t t1 = 1.0 - x[i];
    lbfgsfloatval_t t2 = 10.0 * (x[i+1] - x[i] * x[i]);
    g[i+1] = 20.0 * t2;
    g[i] = -2.0 * (x[i] * g[i+1] + t1);
    fx += t1 * t1 + t2 * t2;
}

fx is for the function value at $x$ and $g(x)$ is a $N\times 1$ (or $100\times 1$) dimensional gradient.

\begin{align} f(x) = \sum_{i=0,2,4,...,N-2}(1-x_i)^2 + \left(10(x_{i+1} - x_i^2)\right)^2 \end{align} The gradient at odd components ($i=1,3,5,...$) is $$200(x_{i+1} - x_i^2) $$ The gradient component at even coordinates ($i=0,2,4,6,...$) is $$-2(1-x_i) - 400*x_i*(x_{i+1} - x_i^2)$$

Note:

  1. indexing starts from 0
  2. I believe the gradients in the code are incorrect. When I change the appropriate line g[i+1] = 20.0 * t2; to g[i+1] = 200.0 * t2; I am getting a different answer. Potentially I may be making a mistake here. Nonetheless, hopefully I have answered your question.

Our fitting problem In our case, we have a two dimensional problem. Rename our $f(x,y)$ to $z$. Then, we have an $m\times 3$ dimensional matrix of values with each row being a tuple $(x_j,y_j,z_j), j=1,...,m$ which are fixed. We could now minimize the function $h(a,b)$ \begin{align} h(a,b) = \sum_{j=1}^{m}(a\cos(x_j) +b y_j - z_j)^2 \end{align} With \begin{align} \frac{\partial h(a,b)}{\partial a} = -2\sum_{j=1}^{m}\left((a\cos(x_j) +b y_j - z_j)\sin(x_j)\right)\\ \frac{\partial h(a,b)}{\partial b} = 2\sum_{j=1}^{m}\left((a\cos(x_j) +b y_j - z_j)y_j\right) \end{align} as the gradient functions. All that you need to do is encode these in place of the for loop above, change #define N 100 to 2 and initialize some initial value of $a,b$ to be passed into the lbfgs function.

  • Any suggestion of how to initialize the `lbfgs_malloced` array? – 0x90 Oct 18 '13 at 01:02
  • Since the documentation notes *A user does not have to use this function for libLBFGS built without SSE/SSE2 optimization*, I assume you want to have SSE optimization? Hmm, I have not explored it. – Theja Tulabandhula Oct 18 '13 at 02:26
0

This is a library I am currently using in a project, and I wanted to add some input to this in case - I cannot add a comment with out a reputation of 50.

I believe that the creator of this library was using the Rosenbrock function as a test, this is a common function used to test unconstrained optimization problems.

Here's a Wikipedia article on the function. https://en.wikipedia.org/wiki/Rosenbrock_function

It's important to note, that for your own application, you need to create your own evaluation function to calculate the gradient of your function at that point.

Max Chandler
  • 101
  • 1