I'm stuck on a seemingly easy task: fitting a nonlinear regression model. It worked fine until some new data came in. Here is my code:

import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import leastsq
from scipy.optimize import curve_fit

x = [1, 8, 20, 40, 80, 125, 250, 500, 2500, 20000]


def f(x, a, b, c, d):
    """ Function itself. """
    return d + (a - d) / (1.0 + np.exp(b * (np.log(x) - np.log(c))))


def err(cfs, x, y):
    """ Error function for 'leastsq'. """
    y_hat = f(x, *cfs)
    res = y_hat - y
    return res


def synth(x, params):
    """ Try some fitting on synthetic data. """
    y = f(x, *params)
    cfs, cov = curve_fit(f, x, y)
    full_res = leastsq(err, np.ones_like(params), args=(x, y), full_output=True)
    y_hat1 = f(x, *cfs)
    y_hat2 = f(x, *full_res[0])
    plt.plot(np.log(x), y, 'bo')
    plt.plot(np.log(x), y_hat1, 'r')
    plt.plot(np.log(x), y_hat2, 'go')
    plt.grid(True)
    plt.show()


def test():
    synth(x, [40000.0, 2.0, 150.0, 4000.0])


if __name__ == '__main__':
    test()

This is synthetic data, but the other data I have looks very similar (and doesn't fit either), so I want to understand why this happens and apply that understanding to the other data sets.

You can see that I tried two minimizers, "curve_fit" and "leastsq". I've also tried "fmin" (the Nelder-Mead method), but it didn't help either.

P.S. I know I could transform the problem by taking log(y), but I wanted to get a result on the data "as is".

Learner

2 Answers

After reading this post, I decided to add a naive regularization to my error function. For now, it works:

def err(cfs, x, y):
    """ Error function for 'leastsq'. """
    y_hat = f(x, *cfs)
    res = y_hat - y
    # add a penalty when the solver tries parameter sets where
    # the lower asymptote (d) is greater than the upper asymptote (a),
    # or where the inflection point (c) <= 0
    penalty_asymptote = 1000.0 * np.fabs(cfs[0] - cfs[3]) if cfs[0] < cfs[3] else 0
    penalty_inflection = 1000.0 * np.fabs(cfs[2]) if cfs[2] <= 0.0 else 0
    return res - penalty_inflection - penalty_asymptote

I don't know yet how to do the same with curve_fit, but that's outside my interest, because I only needed a solution for leastsq.
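As an aside: if the only goal of the penalties is to keep c positive, `scipy.optimize.least_squares` accepts box bounds directly, which avoids hand-rolled penalty terms. This is just a sketch of that alternative (the data-driven starting point is my own heuristic, not part of the original code), and note that the ordering constraint a >= d is not a box bound, so it would still need a penalty or a reparameterization:

```python
import numpy as np
from scipy.optimize import least_squares

def f(x, a, b, c, d):
    """Four-parameter logistic, same model as in the question."""
    return d + (a - d) / (1.0 + np.exp(b * (np.log(x) - np.log(c))))

def err(cfs, x, y):
    return f(x, *cfs) - y

x = np.array([1, 8, 20, 40, 80, 125, 250, 500, 2500, 20000], dtype=float)
y = f(x, 40000.0, 2.0, 150.0, 4000.0)  # synthetic data as in the question

# Box bounds keep c strictly positive (so log(c) stays defined)
# and a, d nonnegative; b is left unconstrained.
res = least_squares(
    err,
    x0=[y.max(), 1.0, np.sqrt(x.min() * x.max()), y.min()],  # crude data-driven start
    bounds=([0.0, -np.inf, 1e-9, 0.0], [np.inf, np.inf, np.inf, np.inf]),
    args=(x, y),
)
```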

This may be a problem with the initial parameters used by curve_fit(), which are all 1.0 by default; if so, here is a way to find out.

scipy.curve_fit allows the user to pass in initial parameters, so if you pass initial parameters that are close to optimal and the fitting succeeds, then that was the problem. Try this:

cfs, cov = curve_fit(f, x, y, p0 = [40001, 3, 151, 4001])  # true parameters, each shifted by 1.0

as a test. I tried this and it worked.
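When the true parameters aren't known in advance, a rough data-driven guess often gets curve_fit unstuck as well; here is a minimal sketch, assuming the same f and synthetic data as in the question (the p0 heuristics are my own, not from the answer):

```python
import numpy as np
from scipy.optimize import curve_fit

def f(x, a, b, c, d):
    """Four-parameter logistic from the question."""
    return d + (a - d) / (1.0 + np.exp(b * (np.log(x) - np.log(c))))

x = np.array([1, 8, 20, 40, 80, 125, 250, 500, 2500, 20000], dtype=float)
y = f(x, 40000.0, 2.0, 150.0, 4000.0)  # synthetic data

# Heuristic starting point: asymptotes from the y extremes, inflection near
# the geometric mean of x (the curve lives on a log scale), slope of order 1.
p0 = [y.max(), 1.0, np.sqrt(x.min() * x.max()), y.min()]
cfs, cov = curve_fit(f, x, y, p0=p0)
```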

James Phillips