
I am analysing vegetation percentage cover data from grazed and ungrazed plots in R using a zero-inflated beta regression in package gamlss. Here are some example data.

site = c("A","A","A","A","B","B","B","C","C","C","D","D","E","E","F","F","F","A","A","A","A","B","B","B","C","C","C","D","D","E","E","F","F","F")
sample = as.factor(c(1:17,1:17))
treatment = c(rep("Grazed", 17), rep("Ungrazed", 17))
dg = c(0.550,0.350,0.750,0.070,0.000,0.000,0.000,0.005,0.000,0.030,0.000,0.005,0.010,0.030,0.250,0.400,0.500,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.003,0.000,0.010,0.010,0.000,0.010)
dat1 = data.frame(site, sample, treatment, dg)

I run the following model...

library(gamlss)
library(lme4)
m1 = gamlss(dg ~ treatment + re(random = list(site = ~1, sample = ~1), method = "REML"),
            family = BEZI, data = dat1)

...and I get the following results with summary(m1):

******************************************************************
Family:  c("BEZI", "Zero Inflated Beta") 

Call:  gamlss(formula = dg ~ treatment + re(random = list(site = ~1,  
sample = ~1), method = "REML"), family = BEZI,      data = dat1) 

Fitting method: RS() 

------------------------------------------------------------------
Mu link function:  logit
Mu Coefficients:
                  Estimate Std. Error t value Pr(>|t|)    
(Intercept)       -1.60833    0.03317  -48.48  < 2e-16 ***
treatmentUngrazed -4.00953    0.25136  -15.95 4.22e-12 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

------------------------------------------------------------------
Sigma link function:  log
Sigma Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)   6.4004     0.3442   18.59 3.07e-13 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

------------------------------------------------------------------
Nu link function:  logit 
Nu Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   0.1178     0.3436   0.343    0.736

------------------------------------------------------------------
NOTE: Additive smoothing terms exist in the formulas: 
 i) Std. Error for smoothers are for the linear effect only. 
ii) Std. Error for the linear terms maybe are not accurate. 
------------------------------------------------------------------
No. of observations in the fit:  34 
Degrees of Freedom for the fit:  15.90911
  Residual Deg. of Freedom:  18.09089 
                  at cycle:  12 

Global Deviance:     -55.82532 
            AIC:     -24.0071 
            SBC:     0.2759435 
******************************************************************

Having consulted this question and Ospina and Ferrari (2012), my understanding is that:

  • Mu represents the mean of the beta distribution part of the model on the interval (0, 1)
  • Sigma represents the precision (or shape) of the beta distribution part of the model
  • Nu represents the Bernoulli part of the model, i.e. the probability of observing zero cover

The above-referenced question mentions how to calculate an overall mean based on these parameters. Is there a (valid) way to calculate an overall p-value for the model, i.e. to test between treatments for values in the range [0, 1), rather than just for mu?
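
For reference, this is how I understand the overall (marginal) mean calculation mentioned above, sketched with the coefficient estimates printed by summary(m1). It is only a rough check: the logit links are inverted by hand with plogis(), the re() random-effect terms are ignored, and the object names are just mine.

# Marginal mean of a zero-inflated beta response: E(Y) = (1 - nu) * mu,
# where nu = P(Y = 0) (Ospina & Ferrari 2012). Both mu and nu use a logit
# link in BEZI, so plogis() back-transforms the printed estimates.
p0_hat      = plogis(0.1178)              # estimated P(Y = 0); same for both treatments in m1
mu_grazed   = plogis(-1.60833)            # beta-part mean, grazed plots
mu_ungrazed = plogis(-1.60833 - 4.00953)  # beta-part mean, ungrazed plots
(1 - p0_hat) * mu_grazed                  # overall mean cover, grazed
(1 - p0_hat) * mu_ungrazed                # overall mean cover, ungrazed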

  • If you shift your emphasis more towards "is there a (valid) way to calculate an overall p-value" and away from "how can I do this in R", you are more likely to survive the "close as off-topic" votes and get an answer to an interesting question. – jbowman Feb 05 '18 at 20:35
  • @jbowman. Fair enough, I've amended the question. – Philip Perrin Feb 05 '18 at 20:54
  • @Michael Chernick. As this question has been closed here, could it be migrated across to Stack Overflow please? – Philip Perrin Feb 12 '18 at 10:57
  • @jbowman I voted to reopen so that we could migrate it to SO instead of just closing it. Also, this may be relevant to CV, because it asks if there is an overall p-value for the model, which is actually an interesting statistical question. It gets at how submodels in beta regression are treated as dependent or independent of one another. – Mark White Mar 10 '18 at 00:58
  • @MarkWhite - I concur. – jbowman Mar 10 '18 at 00:59
  • From my understanding of the model, the likelihood functions for the mu/sigma submodel and the nu submodel can be completely separated from one another in this case, so I don't think there is an "overall p-value" for the entire model. They are all different parameters being estimated, so I think there is one omnibus test for each submodel. This is an interesting question, though, and *not* just about implementation in R. – Mark White Mar 10 '18 at 01:03
  • @Mark White. For others with a similar question, I found this paper (https://www.tandfonline.com/doi/abs/10.1080/03610926.2014.938828), which does what I wanted, but it was beyond my abilities to implement. It uses Fisher's combination method to get a single p-value. I also found this implemented method (https://journal.r-project.org/archive/2015/RJ-2015-019/), but it does not appear to allow for nested random effects. So, in the end I just used a likelihood ratio test to compare the full model, with treatment as a fixed effect, against a null model without it (a sketch of this appears after the comments). – Philip Perrin Mar 11 '18 at 10:35
  • If it is Fisher's method which is holding you up, the package metap (on CRAN) will do that for you. Disclaimer: I wrote the package. I agree this is on-topic; voted to reopen. – mdewey Mar 11 '18 at 12:28
  • @mdewey. Thanks, I think I looked at this package at the time after reading a bit on Fisher's method, and it looks simple enough to implement. I'm just not sure what the combined p-value across the three parameters would actually represent. – Philip Perrin Mar 11 '18 at 14:32
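
For anyone finding this later, here is roughly the likelihood-ratio comparison I ended up using, sketched under my own assumptions rather than as a canonical recipe: treatment is added to the nu formula as well (so the zero part of the distribution can also differ between treatments), the object names are mine, and with the re() smoothers the degrees of freedom, and hence the p-value, are only approximate.

library(gamlss)

# Full model: treatment in both the beta-mean (mu) and zero-probability (nu)
# submodels, so the comparison covers the whole [0, 1) response, not just mu.
m_full = gamlss(dg ~ treatment + re(random = list(site = ~1, sample = ~1), method = "REML"),
                nu.formula = ~ treatment,
                family = BEZI, data = dat1)

# Null model: no treatment effect in either submodel.
m_null = gamlss(dg ~ re(random = list(site = ~1, sample = ~1), method = "REML"),
                nu.formula = ~ 1,
                family = BEZI, data = dat1)

# Likelihood ratio test on the nested models (difference in global deviance);
# gamlss provides LR.test(), and the same test can be done by hand with pchisq().
LR.test(m_null, m_full)
pchisq(deviance(m_null) - deviance(m_full),
       df = m_full$df.fit - m_null$df.fit, lower.tail = FALSE)

The Fisher's-method route discussed above would instead combine the per-submodel p-values for treatment (e.g. with metap::sumlog()), though as I said I am still unsure what that combined value would represent.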

0 Answers