Confidence intervals for values estimated from the nonlinear regression model

Question

I have a question about nonlinear regression and confidence intervals for values estimated from the model. Here is my problem. I have sets of data where $X$ is the logarithm of the dose of a chemical substance and $Y$ is the response readout (normalized to be between 0 and 100%).

I am fitting logistic function to this data using nonlinear regression with 4 parameters:

$$y = A + \frac{B-A}{1+e^{-ax+b}}$$

Once I have estimated the $A, B, a,$ and $b$ parameters, I can estimate the value of $X$ at which $Y$ should take 20%, which is what I need. Everything up to this point is clear.

However, something is unclear to me:

How do I calculate 95% confidence interval for that estimated $X$ value?

I know there is software out there, which can probably do this for me, but I have to understand the principle and algorithm myself, because I have to implement it in my own library.

Any help will be appreciated.

EDIT Dear Eupraxis and Maarten, thank you for your response:

@Eupraxis - Here is the picture, the way I understood your explanation about using function prediction intervals to calculate 95% CI bounds of estimated X value. Is it what you proposed (sorry for hand drawings)?

enter image description here

What if my function cannot be re-expressed in linear way? I am not even sure how to do it for logistic curve anyway. Please also see my comment for Maarten.

@Maarten - Thank you for the article! I tried to read it, but frankly it was quite hard to comprehend. I found what it seems to be a much simpler description of delta method for calculation of prediction interval in nonlinear regression:

How to compute prediction bands for non-linear regression?

Then I was planning to use these 95% PI to calculate 95 CI for x' as Eupraxis suggested above. However, there is one thing I do not quite get in this delta method. Specifically, as described:

First, let's define G|x, which is the gradient of the parameters at a particular value of X and using all the best-fit values of the parameters. The result is a vector, with one element per parameter. For each parameter, it is defined as dY/dP, where Y is the Y value of the curve given the particular value of X and all the best-fit parameter values, and P is one of the parameters.)

Could you maybe explain it, perhaps with some numeric example I can follow? The rest of the algorithm seems to be clear to me.

Dear Eupraxis and Maarten, thank you for your response: **@Eupraxis** - Here is the picture, the way I understood your explanation about using function prediction intervals to calculate 95% CI bounds of estimated X value. Is it what you proposed (sorry for hand drawings)? ![enter image description here](http://i.stack.imgur.com/cZafb.jpg) What if my function cannot be re-expressed in linear way? I am not even sure how to do it for logistic curve anyway. Please also see my comment for Maarten. **@Maarten** - Thank you for the article! I tried to read it, but frankly it was quite hard to compreh — , Apr 17 '14 at 09:34

score 1 · Answer 1 · answered Apr 15 '14 at 07:35

1

You are probably looking for the delta method, see for example:

Oehlert, G. W. 1992. A note on the delta method. American Statistician 46: 27–29.

answered Apr 15 '14 at 07:35

Maarten Buis

19,189
29
59

score 1 · Answer 2 · answered Apr 15 '14 at 14:06

If you're fitting a logistic curve, then first re-express (via a transform) so that the predictor and predicted varialbes are linearly related. Then, form a predicion interval using standard linear regression techniques. In particular, you will want to find the smallest transformed X value whose upper 95% PI cutoff is equal to the transformed Y=20% value, as well as the largest transformed X value whose 95% PI is equal to the transformed Y=20% value. Then, you can back-tranform these values to the original scale to get you X interval.

score 1 · Answer 3 · answered Apr 17 '14 at 12:31

Consider trying to express your model so one of the parameters you fit is the concentration that brings you up to 20%. If you can fit that as one of the parameters, then the CI comes out of the nonlinear regression easily. Here are some notes I wrote for a related equation that might help.

Two notes:

If you really know your normalized data go from 0 to 100, and the normalization is based on solid controls, then you don't need to fit A and B, but can set those to constant values of 0 and 100. If you don't have good controls to normalize to, then there really is no need for normalization at all.
I'd suggest renaming your parameters, so you don't have both b and B in your model. Depending only on upper/lower case to distinguish entirely separate values seems likely to cause a bug someday.

Confidence intervals for values estimated from the nonlinear regression model

3 Answers3

Linked