
My question is about nonlinear least-squares curve fitting in MATLAB; I am familiar with neither.

I have this type of data:

x = [600, 800, 1000, 1200, 1400];
y = [0, 0.2, 0.4, 0.7, 1];

I am trying to use the following algorithms:

f = @(p,x) (p(3)-p(4))./(1+exp(-(x-p(2))./p(1))) + p(4);  % 4-parameter logistic
opts = optimset('Display','off','MaxFunEvals',1000);
sigfit = lsqcurvefit(f, starting_value, x, y, [], [], opts);  % starting_value: initial guess for p
bisection_point = sigfit(2) - sigfit(1)*log((sigfit(3)-0.5)/(0.5-sigfit(4)))  % x where the fitted curve crosses y = 0.5

The only problem I have is the starting value (the second argument to lsqcurvefit, stored here as starting_value). What would be good starting values given the numbers above? Any help would be greatly appreciated.

The code above is based on MATLAB's lsqcurvefit function.

The x vector contains time intervals in milliseconds, whereas the y vector represents responses a participant made, indicating whether those intervals were perceived as closer to a short (400 ms) or a long (1600 ms) reference interval. I don't understand what the starting points mean. Ultimately, what I need to do is find the point where y = 0.5 and the corresponding value on the x axis. The solution will be somewhere between 600 ms and 1400 ms, probably around the 1200 ms mark. I have put in the starting value as a vector from 600 to 1400, but I have no idea whether that is right, or even what it means. I was hoping someone better equipped than me could help answer this problem precisely :).
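
For what it's worth, here is a minimal sketch (my own illustration, not code from the thread) of how data-driven starting values could be read off the data for this parametrization: p(1) is a slope scale, p(2) the midpoint, p(3) the upper asymptote, and p(4) the lower asymptote. The specific numbers in p0 are assumptions inferred from the data, not prescribed values.

    x = [600, 800, 1000, 1200, 1400];
    y = [0, 0.2, 0.4, 0.7, 1];
    f = @(p,x) (p(3)-p(4))./(1+exp(-(x-p(2))./p(1))) + p(4);

    p0 = [ (max(x)-min(x))/4, ...  % p(1): scale; the rise spans roughly 4 scale units
           interp1(y, x, 0.5), ... % p(2): midpoint; x where the data cross y = 0.5
           max(y), ...             % p(3): upper asymptote
           min(y) ];               % p(4): lower asymptote

    opts = optimset('Display','off','MaxFunEvals',1000);
    sigfit = lsqcurvefit(f, p0, x, y, [], [], opts);
    bisection_point = sigfit(2) - sigfit(1)*log((sigfit(3)-0.5)/(0.5-sigfit(4)))

With these data, p0 works out to roughly [200, 1067, 1, 0], which is in the same ballpark as the values discussed in the comments below.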

  • At http://stats.stackexchange.com/a/160575/919 I have posted general advice concerning such questions. – whuber Oct 20 '16 at 18:37
  • Hi, I looked into that thread, but I still have no idea. I don't understand what starting points mean. Ultimately what I need to do is find the 0.5 point on the Y axis and the corresponding value on the X axis. The solution will be somewhere between 600 ms and 1400 ms, probably around the 1200 ms mark. I have put the starting value as a vector from 600 to 1400, but I have no idea whether that is right or what that means. I was hoping someone better equipped than me could help answer this problem precisely :). – Parmenides Oct 20 '16 at 18:58
  • You will greatly increase your chances of getting a specific response by (1) using standard terminology (math and English) rather than matlab code to specify the function, so that everybody will understand immediately what it means, and (2) formatting it clearly. – whuber Oct 20 '16 at 20:32
  • Hi whuber, thank you. I accepted the re-formatting someone kindly did. I am not sure yet how to format text in this forum; I am new here. About the question: I did give the MATLAB code, but I also explained the question, didn't I? – Parmenides Oct 20 '16 at 20:50
  • You have 4 parameters and 5 data points, which is not really enough data to reliably estimate the parameters. However your stated goal seems to be more about interpolating the $(x,y)$ points than anything to do with the parameters. Why not just fit a line? (Assuming you intended decimal points in the 2nd and 3rd $y$ data values, the data is pretty linear.) – GeoMatt22 Oct 20 '16 at 21:16
  • You will have an awful time trying to fit this logistic-looking function to the data you gave. Perhaps "02" and "04" are intended to be "0.2" and "0.4"? – whuber Oct 20 '16 at 21:21
  • I am sorry, yes. 0.2 and 0.4 is correct – Parmenides Oct 20 '16 at 21:42
  • GeoMatt, you are correct. However, I need to interpolate the data to get the bisection point. The curve is more sigmoid than linear for many other participants; the data here are real responses (the y vector) from one participant. – Parmenides Oct 20 '16 at 21:46
  • Again, my question is regarding the starting point for fitting this type of model. What does that mean? Can someone explain this to me please? Can I just put in any random number? That doesn't sound feasible. Do I need to put in numbers related to the x axis? The y axis? The likely slope? The intercept? This is what I am not getting :). – Parmenides Oct 20 '16 at 21:49
  • To specifically answer your question, your optimization variable vector p has length 4, and you have a non-optimization scalar parameter x. The starting value must be a vector of length 4, consisting of your initial "guess" for the values of the 4 elements of p. lsqcurvefit will then iterate to try to find the optimal value of the elements of p. How to come up with a good starting value is another less straightforward matter. – Mark L. Stone Nov 08 '16 at 18:07
  • Well, I just showed how to do it, as well as the answer. – dave fournier Nov 08 '16 at 23:47
  • Thank you so much Dave Fournier, much much appreciated :). – Parmenides Dec 04 '16 at 10:55

1 Answer


Here is the solution to your problem. I rescaled the time values to make things a bit easier. I found the solution using the parametrization I have described in other posts for different curves. For the logistic, the idea is that you replace the lower and upper asymptotes with the predicted Y values for the smallest and largest observed x values; they are Y1 and Yn in the list below.

LA and UA are the lower and upper asymptotes. The x value you are interested in is xm in the list with a value of 1.0678e+01. The std dev of this dependent variable is calculated by the delta method and is 1.0107e-01. For the original data the estimate would be 1067.8 with a std dev of 10.107.
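
For readers who want to reproduce the delta-method step outside AD Model Builder, here is a hedged MATLAB sketch. The inputs phat (the fitted 4x1 parameter vector from the question's lsqcurvefit call) and C (its 4x4 parameter covariance matrix) are assumptions of this sketch; they are not produced in this answer.

    % Delta method: Var(g(p)) ~ grad(g)' * C * grad(g), evaluated at p = phat.
    g = @(p) p(2) - p(1)*log((p(3)-0.5)/(0.5-p(4)));  % bisection point, question's parametrization

    h = 1e-6;                        % step for central finite differences
    grad = zeros(4,1);
    for k = 1:4
        e = zeros(4,1); e(k) = h;
        grad(k) = (g(phat+e) - g(phat-e)) / (2*h);
    end
    sd_xm = sqrt(grad' * C * grad)   % delta-method std dev of the bisection point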

The code I used was written in AD Model Builder, but this could easily be done in R as well (I think, although nls can be extremely unstable on difficult problems like this).

 index   name   value      std.dev
   1   w1     4.0651e-03 1.0339e-02
   2   wn     2.0121e-03 1.0305e-02
   3   a      1.5877e+01 6.3858e+00
   4   log_b -1.6922e+00 5.3681e-01
   5   LA    -5.0287e-01 3.2502e-01
   6   UA     3.1281e+00 2.5605e+00
   7   Y1     4.0651e-03 1.0339e-02
   8   Yn     1.0020e+00 1.0305e-02
   9   b     -1.8412e-01 9.8837e-02
  10   binv  -5.4313e+00 2.9156e+00
  11   xm     1.0678e+01 1.0107e-01

I have been asked to explain why there are so many parameters in the table. When AD Model Builder produces its std dev report, the first parameters listed are the independent variables of the model; for this model there are four. Y1 is the predicted observation for x_1. The obvious starting value for Y1 is Y_1 (note the subscript), the observed Y value at x_1. In the model this is parametrized as

 Y1 =  w1 + Y_1

so that w1 is the independent variable, with an obvious initial value of zero. Doing it this way has minor technical advantages. The same story holds for wn, where

 Yn = wn + Y_n
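
To make the re-parametrization concrete, here is a small sketch (mine, in the question's MATLAB rather than AD Model Builder; a, b, Y1, and Yn are assumed to hold current trial values). Once a and b are fixed, the asymptotes follow from the two endpoint predictions by solving two linear equations:

    x1 = 6;  xn = 14;                % endpoints on the rescaled time axis (600 ms -> 6; inferred from the table)
    r1 = 1/(1 + exp(b*(x1 - a)));    % logistic kernel at x1
    rn = 1/(1 + exp(b*(xn - a)));    % logistic kernel at xn
    span = (Yn - Y1)/(rn - r1);      % = UA - LA, since Yi = LA + (UA-LA)*ri
    LA = Y1 - span*r1;
    UA = LA + span;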

a is the usual parameter a in the logistic (see below) and log_b is the log of -b, so that b = -exp(log_b). The logistic is sometimes parametrized as exp(b*(x-a)) and sometimes as exp((x-a)/b), so binv = 1.0/b is included. Finally, xm is the x value that produces the middle Y observation. On reflection, I'm not sure whether one wants the value halfway between the observed Y's or halfway between the predicted Y's. Above I used the value halfway between the predicted Y's; here is the value halfway between the observed Y's:

 11  xm   1.0655e+01 6.5668e-02

Not very different but the std dev is smaller. The formula for the logistic in terms of the LA, UA, a, and b is

  LA+(UA-LA)/(1+exp(b*(x-a)))
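
As a cross-check (my sketch, not part of the original answer): setting this expression equal to 0.5 and solving for x gives the bisection point in closed form. Plugging in the estimates from the table above:

    LA = -5.0287e-01;  UA = 3.1281e+00;     % estimates from the table
    a  =  1.5877e+01;  b  = -1.8412e-01;
    xm = a + log((UA - 0.5)/(0.5 - LA))/b   % ~10.64, i.e. ~1064 ms on the original scale

The small gap relative to the table's xm values is expected, since the table targets the halfway point between the predicted (or observed) Y's rather than exactly 0.5.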

Anyway, my point is that there is a general way to parametrize these kinds of models. However, I seem unable to get this point across; for some reason people don't seem to understand that I have solved their problem with this kind of re-parametrization. Since a lot of people are familiar with the layout and report format of R's nls/nls2, I have taken my parameter estimates and used them as the starting values for nls2. As you can see, it has already converged, so this is the solution.

library(nls2)
cl<-data.frame(x =c(600, 800, 1000, 1200, 1400),
   Y =c(0, 0.2, 0.4, 0.7, 1) )

logistic <- function(x,LA,UA,a,b) {LA+(UA-LA)/(1+exp(b*(x-a)))}


nls2.m<- nls2(Y ~ logistic(x,LA,UA,a,b), data = cl,
  start = list(LA = -5.0287e-01, UA= 3.1281e+00, a=1.5877e+03,
  b = -1.8412e-03 ) )

Formula: Y ~ logistic(x, LA, UA, a, b)

Parameters:
     Estimate Std. Error t value Pr(>|t|)
LA -5.029e-01  7.108e-01  -0.707    0.608
UA  3.128e+00  5.651e+00   0.554    0.678
a   1.588e+03  1.410e+03   1.126    0.462
b  -1.841e-03  2.171e-03  -0.848    0.552

Residual standard error: 0.02322 on 1 degrees of freedom

Number of iterations to convergence: 1 
Achieved convergence tolerance: 7.227e-06

You might find this more difficult example interesting. There is R code using nls2. The model is

  Y = exp(b0*exp(b1*x^t))

Using the same ideas as here, nls2 converged in fewer than 10 iterations from a standard starting point, much as in the example above. I then changed the final values by less than 15% each, and nls2 immediately crashed when using the standard parametrization. Here is the link.

https://stat.ethz.ch/pipermail/r-help/2016-October/442729.html

dave fournier
  • Could you indicate how your table might be related to the small dataset of just *five* $(x,y)$ pairs given in the question? What do all those "name" values mean with respect to the information explicitly given in the question? – whuber Nov 07 '16 at 18:33