I am struggling with a non-linear minimization, and I wonder if anyone could advise / point me to some literature, please.
In short: I have pharmacokinetic data (plasma concentration vs. time, $C(t)$, following oral administration of a discovery molecule). Models in the literature predict that the concentration will be a sum of exponentials, e.g. $ C(t)= \frac {e^{-a \cdot t}-e^{-(b+c) \cdot t} } {1+a+ \frac c b} $. I built a cost function based on a model of this type (it uses exponential and arctangent transforms to constrain the parameters to physiologically relevant ranges), found reasonable initial estimates for the parameters, and ran the minimization.
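To be concrete about the constraint part: the optimizer works on unconstrained values $p$, and the model maps them into the admissible ranges. A minimal sketch of the transforms (the helper names below are just for illustration):

to_unit <- function(p) 1/2 + atan(p)/pi   # maps an unconstrained p to (0, 1), e.g. for a fraction
to_pos  <- function(p) exp(p)             # maps an unconstrained p to (0, Inf), e.g. for a time constant
# inverse transforms, used to turn physical starting guesses into starting values for the optimizer
from_unit <- function(x) tan(pi*(x - 1/2))
from_pos  <- function(x) log(x)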
Although the method works in many cases, occasionally some parameters seem to 'escape' to very large or very small values (without generating errors: the experimental and calculated points still match well), ending up in a region where changing them makes almost no difference to the fit.
For instance, in $C(t)$ above, with $a=0.7, b=7.7, c=0.01$, increasing $b$ only makes a significant difference in the initial part of the curve (small $t$). I suspect this means that if the early concentrations carry large experimental errors, $b$ (and maybe $c$ too?) cannot be determined accurately. In the same example, $c$ can be changed by several orders of magnitude with hardly any change in the curve or in the residual sum of squares.
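Here is a quick illustration of that with the toy $C(t)$ above (Ct is just an illustrative name):

Ct <- function(t, a, b, c) (exp(-a*t) - exp(-(b+c)*t)) / (1 + a + c/b)
t <- seq(0, 24, by = 0.05)
plot(t, Ct(t, a=0.7, b=7.7, c=0.01), type="l")          # reference curve
lines(t, Ct(t, a=0.7, b=15,  c=0.01), col=2)            # roughly doubling b only changes the early rise
lines(t, Ct(t, a=0.7, b=7.7, c=1e-5), col=4, lty=2)     # c three orders of magnitude smaller: nearly identical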
Parameters $(a, b, c)$ have a precise physiological meaning; in particular, they are related to transfer, absorption and elimination rate constants. Not having accurate values for them is quite bad, for instance when one wants to compare different molecules. Suppose parameter $b$ represents the absorption rate. If I run the minimization for two molecules X and Y and find $b_X = 10, b_Y=1$, can I confidently conclude that X is absorbed more rapidly than Y, given that both parameters can be varied over large ranges without affecting the respective minimization objectives?
In conclusion: do you think what I am observing is an intrinsic, inevitable feature of this type of non-linear function and data? If not, do you know of any alternative method (in R) that would circumvent this sort of 'low sensitivity' issue (if that's what it is), and preferably also give standard errors on the parameter estimates?
So far I have looked at nls, which however doesn't seem to allow much control over the function definition. I like FME, which also does sensitivity analysis, but that would force me to use the parent differential (ODE) form of the model.
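One thing I could try, sketched here using the cost_po, p0po and dfpo objects defined in the script below, is to ask nlm for the Hessian at the minimum and derive approximate standard errors for the transformed parameters (valid only if the Hessian is well conditioned):

mpo_h <- nlm(cost_po, p0po, iterlim = 1000, hessian = TRUE)
sigma2 <- mpo_h$minimum/(length(dfpo$c1) - length(p0po))   # residual variance estimate
# for a least-squares cost, cov(p) is approximately 2*sigma2*solve(hessian)
se_p <- sqrt(diag(2*sigma2*solve(mpo_h$hessian)))          # SEs on the nlm scale, not on TP, TGI, etc.
se_p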
[EDIT: Adding data and more details on the functions used.]
Here's the R script:
# fixed constants used by the model (not fitted here)
CL <- 4.8188619
Vp <- 1.2063197
a1 <- 7.2481778
a2 <- 0.8486741
kU <- 2.562288
kE <- 1.539883
fu <- 0.8537
LBF <- 5.1
BP <- 1
# observed oral concentration-time data and the administered oral dose
dfpo <- data.frame(t1 = c(0,0.25,0.5,1,3,6,24), c1 = c(0,158.333333333333,98.1,69.2666666666667,27.6333333333333,5.41,0))
dose_po <- 5000
# predicted oral plasma concentration at times t1, given the nlm parameter vector p
C_pl_po <- function(p, t1) {
  a1 = a1; a2 = a2                                  # disposition exponents from the global constants
  EH = (min(CL/(LBF*BP),1))*(1/2+atan(p[1])/pi)     # extraction ratio, constrained to (0, min(CL/(LBF*BP),1))
  M  = dose_po*(1-EH)                               # amount escaping extraction
  TG = 0.25; Tex = 36                               # fixed time constants
  fs = 1/2+atan(p[2])/pi                            # fraction, constrained to (0, 1)
  TP = exp(p[3]); TGI = exp(p[4])                   # time constants, constrained to be positive
  B1 = (fs*M/Vp)*(kE-a1)/((TG*a1-1)*(TP*a1-fs-fs*TP/TGI-TP/Tex)*(a2-a1))
  B2 = -(fs*M/Vp)*(kE-a2)/((TG*a2-1)*(TP*a2-fs-fs*TP/TGI-TP/Tex)*(a2-a1))
  B3 = (fs*M/Vp)*(TG*kE-1)/((TG*a2-1)*(TG*a1-1)*(fs+fs*TP/TGI-TP/TG+TP/Tex))
  B4 = -(fs*M/Vp)*(TP*kE-fs-fs*TP/TGI-TP/Tex)/((TP*a2-fs-fs*TP/TGI-TP/Tex)*(TP*a1-fs-fs*TP/TGI-TP/Tex)*(-1+fs*TG/TP+fs*TG/TGI+TG/Tex))
  B1*exp(-a1*t1) + B2*exp(-a2*t1) + B3*exp(-t1/TG) + B4*exp(-t1*(fs/TP+fs/TGI+1/Tex))
}
# initial estimates for the fitted quantities
TG_approx <- 0.25
Tex_approx <- 36
TGI_approx <- 100
# observed time of maximal concentration
tmax_PO_approx <- (dfpo$t1[dfpo$c1==max(dfpo$c1)])[1]
# TP estimate: choose TP so that the one-compartment tmax formula, with ka = 1/TP and ke = CL/Vp, matches the observed tmax
cost_tmax_po <- function(pTP) {TP=exp(pTP); (tmax_PO_approx-log(Vp/CL/TP)/(1/TP-CL/Vp))^2}
TP_approx <- exp(nlm(cost_tmax_po,0)$estimate)
fs_approx <- 0.5
# AUC by the trapezoidal rule, used to estimate the amount M that reaches circulation
lpo <- length(dfpo$c1)
AUC_PO_approx <- sum((dfpo$t1[2:lpo]-dfpo$t1[1:(lpo-1)])*(dfpo$c1[2:lpo]+dfpo$c1[1:(lpo-1)])/2)
M_approx <- AUC_PO_approx*CL*(1+TP_approx/TGI_approx+TP_approx/(fs_approx*Tex_approx))
if (M_approx >= dose_po) {M_approx = dose_po*0.99}
EH_approx <- 1-M_approx/dose_po
# map the physical estimates onto the unconstrained nlm scale (inverse of the atan/exp transforms)
p0po <- c(tan(pi*(EH_approx/min(CL/(LBF*BP),1) -1/2)),tan(pi*(fs_approx-1/2)),log(c(TP_approx,TGI_approx)))
cost_po <- function(p) {sum((dfpo$c1-C_pl_po(p,dfpo$t1))^2)}  # residual sum of squares
mpo <- nlm(cost_po,p0po,iterlim = 1000)                       # run the minimization
ps_po <- mpo$estimate
# back-transform the nlm estimates and return the model coefficients and parameters, for reporting
parspo <- function(p) {
  a1 = a1; a2 = a2
  EH = (min(CL/(LBF*BP),1))*(1/2+atan(p[1])/pi)
  M  = dose_po*(1-EH)
  TG = TG_approx; Tex = Tex_approx
  fs = 1/2+atan(p[2])/pi
  TP = exp(p[3]); TGI = exp(p[4])
  B1 = (fs*M/Vp)*(kE-a1)/((TG*a1-1)*(TP*a1-fs-fs*TP/TGI-TP/Tex)*(a2-a1))
  B2 = -(fs*M/Vp)*(kE-a2)/((TG*a2-1)*(TP*a2-fs-fs*TP/TGI-TP/Tex)*(a2-a1))
  B3 = (fs*M/Vp)*(TG*kE-1)/((TG*a2-1)*(TG*a1-1)*(fs+fs*TP/TGI-TP/TG+TP/Tex))
  B4 = -(fs*M/Vp)*(TP*kE-fs-fs*TP/TGI-TP/Tex)/((TP*a2-fs-fs*TP/TGI-TP/Tex)*(TP*a1-fs-fs*TP/TGI-TP/Tex)*(-1+fs*TG/TP+fs*TG/TGI+TG/Tex))
  c(B1=B1, B2=B2, B3=B3, B4=B4, a1=a1, a2=a2, EH=EH, fs=fs, TG=TG, TP=TP, TGI=TGI, Tex=Tex)
}
pars_po <- parspo(ps_po)
Now, if you look at the nlm object 'mpo', it reports a minimum of 1294.544 with exit code 1, which should indicate proper convergence. The PO parameters in 'pars_po', however, have unreasonable values. In particular TP, which should be the reciprocal of the absorption rate constant, is 1.05e-10, meaning absorption would be enormously fast. And yet the fit looks reasonable enough:
ts1 <- seq(0,12,by=0.01)
plot(ts1,C_pl_po(ps_po,ts1),type="l",col=2)
points(dfpo)
But if you change the value of TP by several orders of magnitude, the curve looks almost the same:
lines(ts1,C_pl_po(c(ps_po[1],ps_po[2],-4,ps_po[4]),ts1),col=4)
cost_po(c(ps_po[1],ps_po[2],-4,ps_po[4]))
ps_po[4] (which maps to TGI) can also be increased essentially indefinitely without affecting the curve at all (this is expected, however, given the model). I would have expected ps_po[3] (which maps to TP) to be better determined, as it is supposed to influence the shape of the curve quite heavily. It appears instead that once TP becomes small enough, TGI can take almost any value, and TP itself can vary over a large range without much change in the fit.
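One check I could run to quantify how flat the objective is around the optimum: scan the cost along the TP direction while holding the other parameters fixed (a crude profile; TP_grid is just an illustrative name):

TP_grid <- seq(-25, 2, by = 0.5)   # exp() of this range spans many orders of magnitude of TP
rss <- sapply(TP_grid, function(x) cost_po(c(ps_po[1], ps_po[2], x, ps_po[4])))
plot(exp(TP_grid), rss, log="x", type="b", xlab="TP", ylab="residual sum of squares")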
What do you think? Am I doing something wrong, can I do better, or is this type of function doomed to give such bad sensitivity on some parameters?