Calculating different pseudo-$R^2$ for a betareg model

Question

Sorry if this is a bit long.. I've been trying to fit models predicting the % of area infested in a field (response between 0 and 100%, total of 61 fields), with four explanatory variables, two factorials (planting and monitor) and two covariates (area2peri and Tmin7_bef).

I used the betareg() with link = "cauchit" (after comparing LL and AIC for the same predictors with different link functions), and tried the different combinations of parameters for both mean and dispersion sub-models. Would like some help understanding the possibilities of using different pseudo-$R^2$ for the different models. I am referring to the table on the UCLA website.

There are a few things that are not clear to me, and looking at betareg documentation did not solve them..

The summary(betareg(...)) provides a pseudo-$R^2$ square that is the "squared correlation of linear predictor and link-transformed response". How is that calculated? and to which type of the pseudo-$R^2$ it relates (Efron's,McFadden's, Cox&Snell etc.)?

I tried calculating the different types myself, as shown in the betareg documentation p.20, fitted a null (intercept only) and full models and extracted the LogLik for both. However there were several issues:

The formula suggested for Mcfadden's $R^2$ in betareg is the inverse of the one presented in the UCLA website, the null model's LL is the numerator in betareg and the denominator in UCLA.. what am I missing?

Here are the results for the different $R^2$, as well as the full model summary:

    Call:
    betareg(formula = A2to5 ~ planting + area2peri_m + monitor + Tmin7_bef | planting + monitor, data = na.omit(pre_n0), 
        link = "cauchit")
    
    Standardized weighted residuals 2:
        Min      1Q  Median      3Q     Max 
    -2.1745 -0.6122  0.1443  0.9129  1.6225 
    
    Coefficients (mean model with cauchit link):
                  Estimate Std. Error z value Pr(>|z|)  
    (Intercept)  -0.766418   0.714323  -1.073   0.2833  
    plantinglate -1.013075   0.409167  -2.476   0.0133 *
    area2peri_m  -0.013529   0.006384  -2.119   0.0341 *
    monitor1      0.712917   0.277825   2.566   0.0103 *
    Tmin7_bef     0.111958   0.049065   2.282   0.0225 *
    
    Phi coefficients (precision model with log link):
                 Estimate Std. Error z value Pr(>|z|)    
    (Intercept)    0.9635     0.2577   3.738 0.000185 ***
    plantinglate   0.4337     0.3331   1.302 0.192929    
    monitor1      -0.2295     0.3145  -0.730 0.465674    
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 

Type of estimator: ML (maximum likelihood)
Log-likelihood: 12.55 on 8 Df
Pseudo R-squared: 0.07703
Number of iterations: 31 (BFGS) + 1 (Fisher scoring)


Mfull <-as.vector(logLik(betareg_cauchit)) # 12.54867
Mintercept <- as.vector(logLik(betareg_cauchit_intonly)) #4.207168
n <- betareg_cauchit_intonly$n #61

## McFadden's pseudo-R-squared (explained portion of variance)- according to betreg documentation
1 -(Mintercept/Mfull) #0.6647

## McFadden's pseudo-R-squared - according to UCLA
1 -(Mfull/Mintercept) #-1.9827

## adjusted McFadden's pseudo-R-squared
1-(Mfull-8)/Mintercept #-0.08117

## Cox&Snell (improvment over null model)
1-((Mintercept/Mfull)^(2/n)) #0.0352

max_cox_senll <- 1-Mintercept^(2/n) #-0.0482

## Cragg & Uhler’s (improvment over null model)
(1-((Mintercept/Mfull)^(2/n)))/max_cox_senll #-0.73

Which brings me to:

In all of my calculation attempts, hopefully not erroneous, I did not achieve the summary's pseudo-$R^2$ (relating to the first question) but what is the meaning of a negative value? Is my fit that bad?
Finally, I get that there is no consensus of which type to use, or should one report the pseudo-$R^2$ at all, but how can I really judge if my models is able to explain something in the world?

Achim Zeileis · Accepted Answer · 2021-11-16T21:50:36.247

The pseudo-R-squared reported by betareg is the squared correlation of the linear predictor and the link-transformed response (default link: logit). For the mj_vd model from example("MockJurors", package = "betareg") that you cite, this can be replicated via:
```
summary(mj_vd) ## reports: Pseudo R-squared: 0.03885
cor(qlogis(MockJurors$confidence), predict(mj_vd, type = "link"))^2 ## [1] 0.03885128
```
This pseudo-R-squared is not explicitly mentioned in the UCLA website but it is of type 3 (square of correlation).
The McFadden pseudo-R-squared provided on the UCLA website is for discrete data where the likelihood contributions are actually probabilities between 0 and 1 (and thus the corresponding log-likelihood are negative). This is not the case for beta regression. In this case Smithson & Verkuilen (2006) also call this "proportional reduction of error (PRE)". This has the inverse ratio of the log-likelihoods. See also: calculating a pseudo R2 value when deviance is negative
The negative value for the McFadden pseudo-R-squared is due to the inverse ratio of log-likelihoods. This only works if both are negative (e.g., as in logistic regression) but here both log-likelihoods are positive.
I generally find the pseudo-R-squared to be of rather limited use in beta regression. Whether the model fit is useful (for you) depends on many aspects, e.g., whether you mostly want a model for the mean or you are really interested in a probabilistic fit. A comprehensive answer for this is beyond the scope of this question, though. Alternative means of model comparisons could include information criteria (AIC or BIC), scores for the mean predicitons like (root) mean-squared error (MSE, RMSE), or probabilistic scoring rules (CRPS, log-score), or graphical model assessments (quantile-quantile plots, PIT histograms, etc.).

Thank you @AchimZeileis for your quick reply! It took me some time to go over it thoroughly, and would like to ask for some clarifications: 1. Why is the inverse ratio being used on beta regression rather than discrete distributions? Smithson & Verkuilen do not provide an explanation and the original paper by McFadden is beyond my abilities for now... 2. I would like to understand further your reply to question 4, could you please direct me to some more information on the subject? I understood it so that probabilistic fit is the method to model the mean. — Roni Gafni, Nov 16 '21 at 11:53
I have included some further pointers in 2. and 4. Regarding a standard reference for the proportional reduction of error (PRE) I'm not sure whether there is a simple, more accessible reference. — Achim Zeileis, Nov 16 '21 at 21:52
Yes, I came across this question yesterday.. I will further explore those options you gave on point 4, thank you very much @AchimZeileis for all your helpful explanations! In general, not just here :) — Roni Gafni, Nov 17 '21 at 08:12

Calculating different pseudo-$R^2$ for a betareg model

1 Answers1

Linked