I have a data frame containing 70000 rows. For each row, I am trying to apply the nlsLM
function (minpack.lm
package) to find the values of a
, e
and c
parameters that give the best fit to the data. Among the 70000 rows, 3412 rows failed with the error message: Error in nlsModel(formula, mf, start, wts) : singular gradient matrix at initial parameter estimates
. I don’t know why the initial conditions are not correct.
To better define the initial conditions, I have added some conditions based on the linear model:
intercept = a*k1*k2 + a*c
c = intercept/a - k1*k2 and c>= 0
slope = a*e
e = slope/a and e >= 0
a >= intercept / k1*k2 and a > 0
Here is the code with a reproducible example:
Y <- structure(list(d = c(0, 0, 1, 0, 0.200000002980232, 0, 0, 1,
0.200000002980232, 0.200000002980232), TR = c(0.000166487283422612,
5.28295131516643e-05, 0.148508191108704, 0.321905970573425, 0.160190761089325,
0.456221789121628, 0.47410836815834, 0.529531538486481, 0.0595778152346611,
0.464331150054932), k1 = c(3000, 3000, 3000, 3000, 3000, 3000,
3000, 3000, 3000, 3000), k2 = c(300, 300, 300, 300, 300, 300,
300, 300, 300, 300), n = c(3000, 3000, 3000, 3000, 3000, 3000,
3000, 3000, 3000, 3000), na = c(1000, 1000, 1000, 1000, 1000,
1000, 1000, 1000, 1000, 1000), nd = c(100, 100, 100, 100, 100,
100, 100, 100, 100, 100), nr = c(30, 30, 30, 30, 30, 30, 30,
30, 30, 30), b = c(100, 100, 100, 100, 100, 100, 100, 100, 100,
100), a = c(NA_real_, NA_real_, NA_real_, NA_real_, NA_real_,
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_), e = c(NA_real_,
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_,
NA_real_, NA_real_), c = c(NA_real_, NA_real_, NA_real_, NA_real_,
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_)), row.names = c(6L,
7585L, 9716L, 10160L, 10475L, 10617L, 10919L, 11066L, 11084L,
11214L), class = "data.frame")
ID_line_NA <- which(is.na(Y[,c("a")]), arr.ind=TRUE)
## print(ID_line_NA)
for(i in ID_line_NA){
print(i)
## Define the variable y
seq_n <- seq(0, Y[i, c("n")], by = 1)
y <- Y[i, c("d")] + (((Y[i, c("TR")]*Y[i, c("b")])/(Y[i, c("k1")]))*(Y[i, c("nr")]/(Y[i, c("na")] + Y[i, c("nd")] + Y[i, c("nr")])))*seq_n
## summary(y)
out <- data.frame(n = seq_n, y = y)
## plot(out$n, out$y)
## Build the linear model to find the values of parameters that give the best fit
mod <- lm(y ~ n, data = out)
## print(mod)
## Define initial conditions
a_threshold <- as.vector(coefficients(mod)[1])/(Y[i, c("k1")]*Y[i, c("k2")])
test_a <- a_threshold
test_e <- as.vector(coefficients(mod)[2])/test_a
test_c <- (as.vector(coefficients(mod)[1])/test_a) - (Y[i, c("k1")]*Y[i, c("k2")])
## Build the nonlinear model
fit <- tryCatch( nlsLM(y ~ a*(e*n + Y[i, c("k1")]*Y[i, c("k2")] + c), data=out,
start=list(a = test_a,
e = test_e,
c = test_c), lower = c(a_threshold, 0, 0), upper = c(Inf, Inf, Inf)),
warning = function(w) return(1), error = function(e) return(2))
## print(fit)
if(is(fit,"nls")){
## Add the parameters a, e and c in the data frame
Y[i, c("a")] <- as.vector(coef(fit)[c("a")])
Y[i, c("e")] <- as.vector(coef(fit)[c("e")])
Y[i, c("c")] <- as.vector(coef(fit)[c("c")])
} else{
print("Error in the NLM")
}
}