
I have a reproducible example here with an attempt to use nls to fit a nonlinear function:

y = ax/(b+x) + c

Even when I set the starting values to a known, good solution, the nls call fails with a singular convergence error. When I restrict the bounds to a narrow range around that solution, it succeeds. Of course, I won't always know suitable upper and lower bounds for curves of this type. How can I improve the original nls call so that it finds the optimized solution?

X = c(5.1,5.3,5.4,5.6,5.7,5.8,6.0,6.3,6.5,6.8,7.0,7.3,7.3,7.6,8.2,8.5,8.5,8.8,8.8,9.1,9.3,9.3,9.8,10.2,10.7,11.1,11.1)
Y = c(118,116,115,114,115,115,114,114,114,112,108,107,111,109,107,108,109,105,105,107,107,107,106,105,104,101,101)
plot(X,Y,xlim=c(3,12),ylim=c(98,120))

#KNOWN, GOOD STARTING VALUES
a = 12
b = -18
c = 119
newx = seq(-100,100,by=0.01)
newy = newx*a/(b+newx)+c
lines(newx,newy)

#Gives singular convergence
myfit = nls(Y~a*X/(b+X)+c,data=NB,start=list(a=a,b=b,c=c),
            control=nls.control(maxiter = 10000),algorithm="port")

#Finds solution if I put in the lower and upper bounds
myfit = nls(Y~a*X/(b+X)+c,data=NB,start=list(a=a,b=b,c=c),
            control=nls.control(maxiter = 10000),
            lower=c(8,-20,110),upper=c(12,-15,130),algorithm = "port")


CodeGuy
  • In what sense are these "known" and "good" starting values? The fit looks bad (examine the residuals if you don't believe me). I suspect a much better fit uses completely different parameter values. Indeed, it looks like the points are nearly linear with a slope close to $-2.5,$ so I tried starting values $(a,b,c)=(-2500,1000,130)$ and had no problems at all--even though these values are far from the final ones. My residual sum of squares is just 49, almost half of yours, demonstrating the improvement. BTW, you need an "NB" object for your code to work. – whuber Jun 27 '19 at 21:24
  • Sure, the fit looks bad. Of course there can be a better solution. But the solution I provide IS a solution, and yet NLS as it stood could not find it. Why not? How can I improve the NLS code so that it has a better chance of finding a solution? – CodeGuy Jun 28 '19 at 12:39
  • In what sense is it a solution, since it manifestly does not minimize the objective function? The duplicate explains various ways to improve your chances of finding solutions. – whuber Jun 28 '19 at 12:53
  • If it might be of some use, my equation search turned up a simple two-parameter scaled power equation that gives an approximately equal fit to your data, the equation is "y = a * pow(x, b)" with parameters a = 1.5416840673681483E+02 and b = -1.6948730387033209E-01 yielding R-squared = 0.9151 and RMSE = 1.349 – James Phillips Jun 28 '19 at 13:19
  • @James Given that you must have explored many possible models, that reduction of RMSE from 1.434 to 1.349 probably should be treated as insignificant. In a context where the OP has a reason to adopt a particular model, one ought to strongly prefer their formulation. – whuber Jun 28 '19 at 14:29
  • @whuber please note that the model I mention in my comment has only two parameters, rather than the original three. I would agree with you if the model from my equation search also had three parameters, but this is not the case. Parsimony itself recommends preference for the model in my comment, given that the fit statistics are approximately the same. For this reason I respectfully disagree in this specific instance. – James Phillips Jun 28 '19 at 15:36
  • @James "Parsimony" does not refer to the number of parameters you end up with: it refers to how many you considered while getting there. Since you haven't disclosed that, we have to presume you used some kind of generalized curve-fitting procedure; many of those consider hundreds if not thousands of distinct models and allow for arbitrarily many parameters. Absent any documentation of how you arrived at your fit, we have to reject it as insignificant. – whuber Jun 28 '19 at 15:38
  • @whuber I limited the initial equation search to two parameters only, so there is no need for such rejection as I understand it. – James Phillips Jun 28 '19 at 15:40
  • @James But how many different models? That matters--a lot. – whuber Jun 28 '19 at 17:03
  • @whuber 142 non-linear equations and 263 linear equations, each with two or fewer parameters. The equation in my comment was the top result. The site is recommending moving this discussion to chat. – James Phillips Jun 28 '19 at 17:52
  • @whuber It is a solution in that it is A fit. ANY fit is a solution. I am not trying to determine the absolute best fit; I am simply stating that the code I showed did not find even A solution (good or bad). Make sense? – CodeGuy Jun 29 '19 at 13:46
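
The disagreement in the comments about whether the starting values are a "solution" can be checked numerically. The following is a minimal sketch, not the asker's or whuber's actual code. It assumes NB was meant to be a data frame holding X and Y (the question never defines it), and the helper rss is a hypothetical name introduced here. It compares the residual sum of squares at the question's starting values with that of an ordinary least-squares line, then tries the starting values suggested by whuber, wrapped in tryCatch so that a convergence failure is reported instead of stopping the script.

```r
# Data copied from the question
X <- c(5.1,5.3,5.4,5.6,5.7,5.8,6.0,6.3,6.5,6.8,7.0,7.3,7.3,7.6,8.2,8.5,8.5,
       8.8,8.8,9.1,9.3,9.3,9.8,10.2,10.7,11.1,11.1)
Y <- c(118,116,115,114,115,115,114,114,114,112,108,107,111,109,107,108,109,
       105,105,107,107,107,106,105,104,101,101)

# ASSUMPTION: NB is not defined in the question; here it simply holds X and Y
NB <- data.frame(X = X, Y = Y)

# Hypothetical helper: residual sum of squares of y = a*x/(b+x) + c0
rss <- function(a, b, c0) sum((Y - (a * X / (b + X) + c0))^2)

# Objective function at the question's "known, good" starting values
rss_start <- rss(12, -18, 119)

# Objective function for a plain straight-line fit, for comparison
rss_line <- sum(resid(lm(Y ~ X, data = NB))^2)

cat("RSS at question's starting values:", rss_start, "\n")
cat("RSS of least-squares line:        ", rss_line, "\n")

# Try the starting values suggested in the comments; report failure if any
fit <- tryCatch(
  nls(Y ~ a * X / (b + X) + c, data = NB,
      start = list(a = -2500, b = 1000, c = 130),
      control = nls.control(maxiter = 1000)),
  error = function(e) e
)

if (inherits(fit, "nls")) {
  print(coef(fit))
  cat("RSS of nls fit:", deviance(fit), "\n")
} else {
  message("nls did not converge: ", conditionMessage(fit))
}
```

If the straight line has the smaller residual sum of squares, the question's starting values are far from a least-squares optimum, because this model can approximate a line arbitrarily closely when b is large and a/b is held at the slope. Whether the nls call converges from these starting values on a given R version is not verified here.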
