Initial conditions for nonlinear models using the nlsLM function

Question

I have a data frame containing 70000 rows. For each row, I am trying to apply the nlsLM function (minpack.lm package) to find the values of a, e and c parameters that give the best fit to the data. Among the 70000 rows, 3412 rows failed with the error message: Error in nlsModel(formula, mf, start, wts) : singular gradient matrix at initial parameter estimates. I don’t know why the initial conditions are not correct. To better define the initial conditions, I have added some conditions based on the linear model:

intercept = a*k1*k2 + a*c
c = intercept/a - k1*k2  and c>= 0
slope = a*e
e = slope/a and e >= 0
a >= intercept / k1*k2 and a > 0

Here is the code with a reproducible example:

Y <- structure(list(d = c(0, 0, 1, 0, 0.200000002980232, 0, 0, 1, 
                           0.200000002980232, 0.200000002980232), TR = c(0.000166487283422612, 
                                                                         5.28295131516643e-05, 0.148508191108704, 0.321905970573425, 0.160190761089325, 
                                                                         0.456221789121628, 0.47410836815834, 0.529531538486481, 0.0595778152346611, 
                                                                         0.464331150054932), k1 = c(3000, 3000, 3000, 3000, 3000, 3000, 
                                                                                                    3000, 3000, 3000, 3000), k2 = c(300, 300, 300, 300, 300, 300, 
                                                                                                                                    300, 300, 300, 300), n = c(3000, 3000, 3000, 3000, 3000, 3000, 
                                                                                                                                                               3000, 3000, 3000, 3000), na = c(1000, 1000, 1000, 1000, 1000, 
                                                                                                                                                                                               1000, 1000, 1000, 1000, 1000), nd = c(100, 100, 100, 100, 100, 
                                                                                                                                                                                                                                     100, 100, 100, 100, 100), nr = c(30, 30, 30, 30, 30, 30, 30, 
                                                                                                                                                                                                                                                                      30, 30, 30), b = c(100, 100, 100, 100, 100, 100, 100, 100, 100, 
                                                                                                                                                                                                                                                                                         100), a = c(NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
                                                                                                                                                                                                                                                                                                     NA_real_, NA_real_, NA_real_, NA_real_, NA_real_), e = c(NA_real_, 
                                                                                                                                                                                                                                                                                                                                                              NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
                                                                                                                                                                                                                                                                                                                                                              NA_real_, NA_real_), c = c(NA_real_, NA_real_, NA_real_, NA_real_, 
                                                                                                                                                                                                                                                                                                                                                                                         NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_)), row.names = c(6L, 
                                                                                                                                                                                                                                                                                                                                                                                                                                                                     7585L, 9716L, 10160L, 10475L, 10617L, 10919L, 11066L, 11084L, 
                                                                                                                                                                                                                                                                                                                                                                                                                                                                     11214L), class = "data.frame")




ID_line_NA <- which(is.na(Y[,c("a")]), arr.ind=TRUE)
## print(ID_line_NA)

for(i in ID_line_NA){

  print(i)

  ## Define the variable y
  seq_n <- seq(0, Y[i, c("n")], by = 1)
  y <- Y[i, c("d")] + (((Y[i, c("TR")]*Y[i, c("b")])/(Y[i, c("k1")]))*(Y[i, c("nr")]/(Y[i, c("na")] + Y[i, c("nd")] + Y[i, c("nr")])))*seq_n
  ## summary(y)
  out <- data.frame(n = seq_n, y = y)
  ## plot(out$n, out$y)

  ## Build the linear model to find the values of parameters that give the best fit 
  mod <- lm(y ~ n, data = out)
  ## print(mod)

  ## Define initial conditions
  a_threshold <- as.vector(coefficients(mod)[1])/(Y[i, c("k1")]*Y[i, c("k2")])
  test_a <- a_threshold
  test_e <- as.vector(coefficients(mod)[2])/test_a
  test_c <- (as.vector(coefficients(mod)[1])/test_a) - (Y[i, c("k1")]*Y[i, c("k2")])

  ## Build the nonlinear model
  fit <- tryCatch( nlsLM(y ~ a*(e*n + Y[i, c("k1")]*Y[i, c("k2")] + c), data=out,
                         start=list(a = test_a,
                                    e = test_e,
                                    c = test_c), lower = c(a_threshold, 0, 0), upper = c(Inf, Inf, Inf)),
                   warning = function(w) return(1), error = function(e) return(2))
  ## print(fit)

  if(is(fit,"nls")){

    ## Add the parameters a, e and c in the data frame
    Y[i, c("a")] <- as.vector(coef(fit)[c("a")])
    Y[i, c("e")] <- as.vector(coef(fit)[c("e")])
    Y[i, c("c")] <- as.vector(coef(fit)[c("c")])

  } else{

    print("Error in the NLM")

  }
}

If I'm reading the formula for `fit` correctly, your model seems to be of the form $$E[y]=a(ex_1 + x_2 + c) = (ac)+(ae)x_1+(a)x_2$$ where the $x_i$ are given by the data. Since this is a *linear* model in the parameters $(ac, ae, a),$ why not just estimate them directly using Ordinary Least Squares?? If the issue is that $a$ must exceed a lower threshold, simply check whether your estimate of $a$ exceeds it. If not, set $a$ to the threshold and use OLS to estimate $e$ and $c.$ — whuber, Mar 21 '19 at 14:08
Thank you very much whuber for your answer. Using your model, the x1 and x2 variable lengths differ. The model would be of the form: `E[y] = aex1 + ak1k2 + ac`. k1, k2, a, e and c have specific values for each row i. k1 and k2 are known and a, e and c needed to be estimate. — Pierre, Mar 21 '19 at 14:58
I cannot understand the meaning of "variable lengths differ" in this context. As an example, in the last iteration of your loop the data are 3001 observations of two variables, $n$ and $y$ and the expression `Y[i, c("k1")]*Y[i, c("k2")]` is a single number equal to 900,000. Thus your model is trying (in a rather strange way) to fit a linear function relating `y` to `n`, but using *three* instead of *two* parameters: of course it's going to fail! — whuber, Mar 21 '19 at 16:18
If we consider that `x1` corresponds with `n` and `x2` with `k1*k2`, n has 3001 values and `k1*k2` has a single value in the last iteration, thus the parameter estimation from OLS doesn't work because the x1 and x2 variable lengths differ. Why should my model use two parameters ? Is it not possible to estimate the three parameters, a, e and c ? Thank you very much for your help. — Pierre, Mar 21 '19 at 16:51
That `k1*k2` has a single value just means it can be omitted altogether and absorbed into the value of `c`. The reason why at most two parameters are needed is because that is all one uses to determine a linear function. — whuber, Mar 21 '19 at 17:48
Sorry, I am not sure to understand. Currently, the model is of the form: Y = aeX + a(k1k2 + c) with intercept = a(k1k2 + c) et slope = ae. The intercept and slope values are known and are defined from the model `mod`. The idea would be to replace (k1k2 + c) with c and fit the model Y = aeX + ac ? Thank you very much for your time. — Pierre, Mar 21 '19 at 19:34
If I'm reading your code correctly, `a`, `c`, and `e` are all parameters that it will estimate. Regardless, your account is correct: your model appears to be equivalent to `Y=(ae)X + (ac).` That uses three numbers to estimate the *two* parameters shown in parentheses. — whuber, Mar 21 '19 at 19:46
I am not sure to understand if I estimate a,e and c using OLS or nlsLM. For example, in the last iteration, aj = 0.2 and ea = 0.0004109, with j = k1k2 + c. But using these relationships, I don't see how it is possible to find a, e and j. — Pierre, Mar 21 '19 at 20:40
Exactly! When you use three parameters in place of two, nothing is definite. This is called "lack of identifiability." The problem isn't with the software or the procedure, it's with your model. You need to change it. — whuber, Mar 21 '19 at 20:45
I understand. The model is based on literature. I don't know how I can modify it. — Pierre, Mar 21 '19 at 22:49
Then you should be concerned that your implementation might not correctly reflect the literature. — whuber, Mar 21 '19 at 23:05
However, I have followed advice from the article. In addition, I have tested a solution by fixing the parameter c but I have also the error message Error in nlsModel(formula, mf, start, wts) : singular gradient matrix at initial parameter estimates. — Pierre, Mar 21 '19 at 23:48

Initial conditions for nonlinear models using the nlsLM function

0 Answers0