
First, I should say that I have read a great post on this topic, but it doesn't quite answer my question. This post came very close, but I still couldn't solve my issue from it.

I would like to run a beta regression and then write out the prediction equation, as I ordinarily would in multiple regression. However, I am struggling to transform the calculated value so that it takes the loglog link into account. My aim is to match the value that the betareg predict method returns.

I have therefore put together a dummy example below. Perhaps I could get some advice on where I am going wrong, or on what I need to do to get the correct value.

library(betareg)  # provides betareg()

# Made up data
x1 = c(0.051,0.049,0.046,0.042,0.042,0.041,0.038,0.037,0.043,0.031)
x2 = c(0.11,0.12,0.09,0.21,0.18,0.11,0.13,0.11,0.08,0.10)
y = c(0.97,0.87,0.77,0.65,0.77,0.84,0.76,0.73,0.82,0.90)

data = data.frame(x1,x2,y)

# run beta regression on data using loglog link
regression.beta = betareg(y ~ x1 + x2, link = "loglog")

# summarise result: 
summary(regression.beta)

The result from the regression is as follows:

Call:
betareg(formula = y ~ x1 + x2, link = "loglog")

Standardized weighted residuals 2:
    Min      1Q  Median      3Q     Max 
-1.4901 -0.8370 -0.2718  0.2740  2.6258 

Coefficients (mean model with loglog link):
            Estimate Std. Error z value Pr(>|z|)  
(Intercept)    1.234      1.162   1.062   0.2882  
x1            31.814     26.715   1.191   0.2337  
x2            -7.776      3.276  -2.373   0.0176 *

Phi coefficients (precision model with identity link):
      Estimate Std. Error z value Pr(>|z|)  
(phi)    24.39      10.83   2.252   0.0243 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 

Type of estimator: ML (maximum likelihood)
Log-likelihood: 12.06 on 4 Df
Pseudo R-squared: 0.2956
Number of iterations: 232 (BFGS) + 12 (Fisher scoring) 

Using the following method results in a value of 0.874.

# predict result of first row
predicted = predict(regression.beta, newdata = data[1,])

When I try to recreate this using the regression equation, I do the following...

outcome = 1.234 + (0.051*31.814) - (0.11*-7.746)
outcome = 3.708

My understanding is that I now need to transform this outcome variable so that it lies between 0 and 1. I have read in places that I need to antilog the outcome variable, but I cannot get this to work so that it is similar to the value from the predict method.

I'd be grateful for any advice.

Thanks in advance.

Tim
  • Your question might rest on the misunderstanding that the *loglog* link you specified is a *logarithm*, but it is not. The link for a response $\hat y$ is $-\log(-\log(\hat y)).$ You can obtain the desired result using `predict(regression.beta, newdata = data[1,], type="response")`. One more thing: your computation of `outcome` doubly negates the last term. It needs to be *added* to produce a linear predictor of $2.004454.$ The inverse log-log transform of that is indeed $0.8735228...$ – whuber Dec 13 '18 at 21:26
  • ... i.e. `exp(-exp(-( 1.234 + (0.051*31.814) + (0.11*-7.746) )))` gives `0.8739485` – Henry Dec 13 '18 at 22:24
  • Thanks for taking the time to respond. I now have this working correctly. As you can see I am trying to relate the process back to how I would normally use a multiple regression. I do have a follow-up question about the interpretation of the coefficients when using the log-log link, but I will post this separately. – Tim Dec 14 '18 at 07:35
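
To tie the comments together, here is a minimal sketch in base R of the hand calculation. It assumes the rounded coefficients printed in the summary output (with `-7.776` for `x2`) and uses a helper name, `inv_loglog`, that is purely illustrative and not part of betareg. Because the coefficients are rounded, the result should only agree with `predict` to about three decimal places.

```r
# Coefficients as printed (rounded) in the summary output
b0 <- 1.234
b1 <- 31.814
b2 <- -7.776

# Illustrative helper (not a betareg function): inverse of the
# loglog link g(mu) = -log(-log(mu))
inv_loglog <- function(eta) exp(-exp(-eta))

# Predictor values from the first row of the made-up data
x1 <- 0.051
x2 <- 0.11

# Linear predictor: every term is added; the sign lives in the coefficient
eta <- b0 + b1 * x1 + b2 * x2

# Map the linear predictor back to the (0, 1) response scale
mu <- inv_loglog(eta)
round(mu, 3)
```

With the fitted model at hand, `predict(regression.beta, newdata = data[1,], type = "link")` should return the unrounded linear predictor, which makes it easier to see where a hand calculation starts to drift.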
