I am trying to interpret the output of nls(). I have read this post but I still don't understand how to choose the best fit. From my fits I have two outputs:

> summary(m)

  Formula: y ~ I(a * x^b)

  Parameters:
  Estimate Std. Error t value Pr(>|t|)    
  a 479.92903   62.96371   7.622 0.000618 ***
  b   0.27553    0.04534   6.077 0.001744 ** 
  ---
  Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

  Residual standard error: 120.1 on 5 degrees of freedom

  Number of iterations to convergence: 10 
  Achieved convergence tolerance: 6.315e-06 

and

> summary(m1)

  Formula: y ~ I(a * log(x))

  Parameters:
  Estimate Std. Error t value Pr(>|t|)    
  a   384.49      50.29   7.645 0.000261 ***
  ---
  Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

  Residual standard error: 297.4 on 6 degrees of freedom

  Number of iterations to convergence: 1 
  Achieved convergence tolerance: 1.280e-11

The first one has two parameters and a smaller residual standard error. The second has only one parameter but a larger residual standard error. Which is the better fit?

emanuele
  • 4
    There's much more to assessing a model than looking at one or two summary statistics. What do the residuals look like? Do any of the data exhibit too much leverage? What do the goodness of fit diagnostics say? Does theory suggest one of these models should be preferred? For what values of $x$ do these fits differ substantially and does that matter? Etc. – whuber Sep 03 '12 at 14:45
  • 3
    I deleted my answer, which suggested using `AIC`, because a comment made a compelling case that AIC is not generally applicable for selection of `nls` fits. I would always try to decide for a nonlinear model based on mechanistic knowledge, particularly if the data set is as small as yours. – Roland Sep 03 '12 at 14:46
  • 1
    Hmmm. Would the original commenter on @Roland's now-deleted answer be willing to repost the comment? It's not immediately obvious to me why AIC would not be appropriate ... (although https://stat.ethz.ch/pipermail/r-help/2010-August/250742.html gives some hints) -- and as a final note, if you're trying to identify a power transformation, you might try Box-Cox transformations (`boxcox` in the `MASS` package) – Ben Bolker Sep 24 '12 at 17:43
  • 1
    AIC could be used to select models. –  Aug 26 '15 at 04:26
  • @Roland I would have also loved to see this answer/comment. There is a lot of value in reading discussions on controversial answers here. – RTbecard Sep 12 '20 at 18:49
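
The checks suggested in the comments (residual patterns, and where the two fits actually disagree) could be sketched roughly as below. The question does not show its data, so the seven points here are simulated and purely hypothetical; seven was chosen only because the residual degrees of freedom in the two summaries (5 with two parameters, 6 with one) imply seven observations. The start values are guesses as well.

```r
# Hypothetical data: the question does not show its data, so these seven
# points are simulated purely for illustration.
set.seed(42)
x <- c(2, 5, 10, 20, 50, 100, 200)
y <- 480 * x^0.28 + rnorm(length(x), sd = 50)

# The two candidate models from the question (start values are guesses)
m  <- nls(y ~ a * x^b,    start = list(a = 400, b = 0.3))
m1 <- nls(y ~ a * log(x), start = list(a = 300))

# Residuals against fitted values: look for curvature or trends
op <- par(mfrow = c(1, 2))
plot(fitted(m),  resid(m),  main = "a * x^b",    xlab = "fitted", ylab = "residual")
abline(h = 0, lty = 2)
plot(fitted(m1), resid(m1), main = "a * log(x)", xlab = "fitted", ylab = "residual")
abline(h = 0, lty = 2)
par(op)

# Where do the two fitted curves disagree most over the observed range?
grid <- data.frame(x = seq(min(x), max(x), length.out = 50))
gap  <- predict(m, newdata = grid) - predict(m1, newdata = grid)
grid$x[which.max(abs(gap))]
```

With so few points the plots will not be conclusive, which is part of why the comments lean on mechanistic knowledge rather than a single summary statistic.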

1 Answer

You can use an F test via `anova()` to compare nested fits. Here is some example code.

> x <- 1:10
> y <- 2*x + 3                            
> yeps <- y + rnorm(length(y), sd = 0.01)
> 
> 
> m1=nls(yeps ~ a + b*x, start = list(a = 0.12345, b = 0.54321))
> summary(m1)

Formula: yeps ~ a + b * x

Parameters:
   Estimate Std. Error t value Pr(>|t|)    
a 2.9965562  0.0052838   567.1   <2e-16 ***
b 2.0016282  0.0008516  2350.6   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

Residual standard error: 0.007735 on 8 degrees of freedom

Number of iterations to convergence: 2 
Achieved convergence tolerance: 3.386e-09 

> 
> 
> m2=nls(yeps ~ a + b*x+c*I(x^5), start = list(a = 0.12345, b = 0.54321,c=10))
> summary(m2)

Formula: yeps ~ a + b * x + c * I(x^5)

Parameters:
   Estimate Std. Error  t value Pr(>|t|)    
a 3.003e+00  5.820e-03  516.010   <2e-16 ***
b 1.999e+00  1.364e-03 1466.004   <2e-16 ***
c 2.332e-07  1.236e-07    1.886    0.101    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

Residual standard error: 0.006733 on 7 degrees of freedom

Number of iterations to convergence: 2 
Achieved convergence tolerance: 1.300e-06 

> 
> anova(m1,m2)
Analysis of Variance Table

Model 1: yeps ~ a + b * x
Model 2: yeps ~ a + b * x + c * I(x^5)
  Res.Df Res.Sum Sq Df     Sum Sq F value Pr(>F)
1      8 0.00047860                             
2      7 0.00031735  1 0.00016124  3.5567 0.1013
>
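
To read the table above: the F value compares the drop in residual sum of squares per added parameter with the residual variance of the larger model. A minimal sketch that recomputes it by hand from the printed numbers follows; the variable names are just illustrative, and because the inputs are the rounded values shown in the table, the result should agree with the printed F and p-value only up to rounding.

```r
# Values copied from the anova(m1, m2) table above
rss1 <- 0.00047860; df1 <- 8   # smaller model: yeps ~ a + b * x
rss2 <- 0.00031735; df2 <- 7   # larger model: adds c * I(x^5)

# Extra sum of squares per extra parameter, relative to the larger
# model's residual variance
f_stat <- ((rss1 - rss2) / (df1 - df2)) / (rss2 / df2)

# Upper-tail probability of the F distribution with (df1 - df2, df2) df
p_val <- pf(f_stat, df1 - df2, df2, lower.tail = FALSE)
c(F = f_stat, p = p_val)
```

A p-value of about 0.10 means the extra `x^5` term does not improve the fit significantly at the usual 5% level, so the simpler model would be kept. This comparison assumes the smaller model is a special case of the larger one; the two models in the question (`a * x^b` and `a * log(x)`) are not nested in that sense.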
Stat
  • 5
    More information on how to interpret the results? – skan May 01 '15 at 13:29
  • Please expand. With my dataset I get no output for F value and for Pr(>F). What is the point of running the anova analyses? I am only familiar with it being used for comparing categories not models. – user3386170 Feb 20 '18 at 19:12