20

I have some data that is bounded between 0 and 1. I have used the betareg package in R to fit a regression model with the bounded data as the dependent variable. My question is: how do I interpret the coefficients from the regression?

Nick Cox
Thomas Jensen
  • Give this pdf a read: http://cran.r-project.org/web/packages/betareg/vignettes/betareg.pdf Lots of useful examples that should answer your question. –  Jul 04 '13 at 15:20

1 Answer

32

First you need to identify the scale on which the response is modeled. With its default logit link, the betareg function in R fits the following model

$$\text{logit}(\mu_i)=\beta_0+\sum_{j=1}^p\beta_j x_{ij}$$

where $\mu_i=E(y_i)$ is the mean of the response and $\text{logit}(\mu_i)=\log\{\mu_i/(1-\mu_i)\}$ is the usual log-odds we are used to when using the logit link in the glm function (i.e., family binomial) in R. Thus each coefficient $\beta_j$ that betareg returns is the increase (or decrease, if it is negative) in the log-odds of the mean response for a one-unit increase in $x_{ij}$, holding the other covariates fixed. I am assuming you want to interpret the results on the original scale of the response (i.e., on the interval (0,1)), so once you have your beta coefficients all you need to do is invert the link, i.e.,

$$\text{logit}(\mu_i)=\beta_0+\sum_{j=1}^p\beta_j x_{ij}\Rightarrow \mu_i=\frac{e^{\beta_0+\sum_{j=1}^p\beta_j x_{ij}}}{1+e^{\beta_0+\sum_{j=1}^p\beta_j x_{ij}}}$$

Thus you should realize that we are basically using the same results and interpretations from standard generalized linear modeling (under the logit link). One of the main differences between logistic regression and beta regression is that beta regression estimates a separate precision parameter, so the variance of the response is not tied to the mean as it is in the binomial model. That extra flexibility is what lets it deal with the typical problem of over-dispersion.
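To make the back-transformation concrete, here is a minimal numeric sketch in plain Python. The intercept and slope are hypothetical values chosen for illustration, not output from any fitted betareg model, and the helper names `inv_logit` and `predicted_mean` are made up for this example. It shows that the slope acts multiplicatively on the odds of the mean, while the change on the (0,1) scale depends on where you start.

```python
import math


def inv_logit(eta):
    """Map a linear predictor on the log-odds scale back to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-eta))


# Hypothetical coefficients, NOT taken from any fitted model.
beta0 = -1.0  # intercept: log-odds of the mean response when x = 0
beta1 = 0.05  # slope: change in log-odds of the mean per unit of x

# Multiplicative change in mu / (1 - mu) for a one-unit increase in x.
odds_ratio = math.exp(beta1)


def predicted_mean(x):
    """Mean response implied by the hypothetical coefficients at covariate value x."""
    return inv_logit(beta0 + beta1 * x)


for x in (0, 10, 20):
    mu = predicted_mean(x)
    step = predicted_mean(x + 1) - mu  # change in the mean for +1 unit of x
    print(f"x={x:>2}  mean={mu:.4f}  change for +1 unit={step:.4f}")

print(f"odds ratio per unit of x: {odds_ratio:.4f}")
```

The odds ratio is the same at every value of x, but the change in the mean itself is not, which is why predictions on the response scale are often easier to communicate than the raw coefficients. In R, the base function `plogis()` computes the same inverse logit.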

Nick Cox
  • @Nick Cox Suppose the response is the proportion of species observed and the independent variable is TEMPERATURE. My confusion with betareg is what the coefficient indicates an increase in: the odds of what? In a typical logistic regression the outcome is categorical, so I get intuitively that there is an increase in the odds of being in a category, but with a continuous proportion outcome how can you explain an increase in odds? If the temperature coefficient is .05, so exp(.05) = 1.05, that would say a one-unit increase in temperature leads to a 1.05 increase in what? – user3022875 Aug 13 '17 at 18:59
  • @user3022875 In the example you give, it represents an increase in the ratio of the proportion of species observed to the proportion of species not observed. The odds is just the ratio between the positive and negative classes, p/(1-p), so rather than saying "odds" you can just describe the ratio explicitly. – Bryan Shalloway Feb 21 '19 at 09:07
  • so in the example from user3022875 the interpretation would be: one unit increase in temp leads to 5% increase in the ratio of proportion species observed to proportion of species not observed. or simply, one unit increase in temp leads to 5% increase in the ratio of proportion species observed. is that right, @BryanShalloway? – user1607 May 26 '19 at 17:20
  • @user1607 You may find it helpful to calculate marginal predictions (sometimes called postestimation procedures) so that you can return to the original scale. https://www.stata.com/stata14/fractional-outcome-models/ – Jeffrey Girard Apr 14 '20 at 16:00