Questions tagged [tweedie-distribution]

A family of distributions from the exponential dispersion family with a power-law mean-variance relationship. For power $p$ strictly between 1 and 2, the distribution is a compound Poisson-gamma distribution that has a point mass at zero and is continuous on the positive numbers.

For $1<p<2$, a Tweedie-distributed variable can be written as the sum of a Poisson-distributed number of gamma-distributed terms:

$$\sum_{i=1}^NX_i$$

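where $N\sim\text{Pois}(\lambda)$, the $X_i\sim\text{Gamma}(\alpha,\beta)$ are identically distributed, and $N$ and the $X_i$ are all independent. The sum is taken to be zero when $N=0$, which produces the point mass at zero.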
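
As a rough illustration of the compound Poisson-gamma construction, here is a minimal simulation sketch with NumPy. It assumes the gamma terms share a common shape `alpha` and rate `beta`; the function name and the parameter values are made up for the example. It relies on the fact that a sum of $n$ independent $\text{Gamma}(\alpha,\beta)$ variables is $\text{Gamma}(n\alpha,\beta)$, and on the standard correspondence $p=(\alpha+2)/(\alpha+1)$ between the gamma shape and the Tweedie power.

```python
import numpy as np

def simulate_compound_poisson_gamma(lam, alpha, beta, size, seed=0):
    """Draw `size` samples of sum_{i=1}^N X_i with N ~ Pois(lam) and
    X_i ~ Gamma(shape=alpha, rate=beta), all independent (illustrative helper)."""
    rng = np.random.default_rng(seed)
    n = rng.poisson(lam, size=size)   # number of gamma terms in each sum
    y = np.zeros(size)                # N = 0 gives an exact zero
    positive = n > 0
    # A sum of n i.i.d. Gamma(alpha, rate=beta) terms is Gamma(n * alpha, rate=beta);
    # NumPy parametrizes the gamma by scale = 1 / rate.
    y[positive] = rng.gamma(shape=n[positive] * alpha, scale=1.0 / beta)
    return y

# Hypothetical parameter values, chosen only for the example.
lam, alpha, beta = 1.5, 2.0, 0.5
y = simulate_compound_poisson_gamma(lam, alpha, beta, size=200_000)

zero_fraction = np.mean(y == 0)            # should be close to exp(-lam)
sample_mean = y.mean()                     # should be close to lam * alpha / beta
implied_power = (alpha + 2) / (alpha + 1)  # Tweedie power p, always in (1, 2)
```

The fraction of exact zeros should approach $e^{-\lambda}$ and the sample mean should approach $\lambda\alpha/\beta$, which is a quick sanity check on the representation.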

For more on this distribution, refer to the Wikipedia page or Bent Jørgensen's paper. There is also a book by Jørgensen on the same topic.

57 questions
33
votes
2 answers

How to model non-negative zero-inflated continuous data?

I'm currently trying to apply a linear model (family = gaussian) to an indicator of biodiversity that cannot take values lower than zero, is zero-inflated and is continuous. Values range from 0 to a little over 0.25. As a consequence, there is quite…
19
votes
3 answers

Can a model for non-negative data with clumping at zeros (Tweedie GLM, zero-inflated GLM, etc.) predict exact zeros?

A Tweedie distribution can model skewed data with a point mass at zero when the parameter $p$ (exponent in the mean-variance relationship) is between 1 and 2. Similarly a zero-inflated (whether otherwise continuous or discrete) model may have a…
13
votes
2 answers

Possible to evaluate GLM in Python/scikit-learn using the Poisson, Gamma, or Tweedie distributions as the family for the error distribution?

Trying to learn some Python and Sklearn, but for my work I need to run regressions that use error distributions from the Poisson, Gamma, and especially Tweedie families. I don't see anything in the documentation about them, but they are in several…
11
votes
1 answer

What is the use of a Tweedie or Poisson loss/objective function in XGBoost and deep learning models?

I am looking at a few competitions on Kaggle where people used Tweedie loss or Poisson loss as the objective function for forecasting sales or predicting insurance claims. Can someone please explain the use/need for using Tweedie or Poisson instead of…
11
votes
1 answer

What is the canonical link function for a Tweedie GLM?

I was just introduced to the Tweedie distribution (see this or this) but I'm having a hard time finding what the link function is for a Tweedie generalized linear model. Thoughts?
9
votes
1 answer

Tweedie p parameter Interpretation

From Wikipedia (http://en.wikipedia.org/wiki/Tweedie_distribution) we know that The Tweedie distributions include a number of familiar distributions as well as some unusual ones, each being specified by the domain of the index parameter. We have…
7
votes
1 answer

Given a GLM using Tweedie, how do I find the coefficients?

Let $Y$ be a random variable that obeys the Tweedie distribution for parameter $\alpha = 1.1$. Let the link function be the natural log. Assume that we have a database of numbers of the form $(y_1, x_{1,1}, x_{1,2}, ..., x_{1,m})$ $(y_2, x_{2,1},…
6
votes
1 answer

Chi Square vs F Tests for GLM Model Comparisons

I've been creating some models in R using glm() and rxGlm(). I'm experienced in building GLMs but my memory of some of the underlying theory is a little rusty. I'm interested in comparing model fits for nested models using chi-square tests, F tests,…
6
votes
1 answer

Skewness of Tweedie distribution

Tweedie distributions are a family of distributions from the exponential dispersion family that have a power-law mean-variance relationship: \begin{align} \mathbb E[X] &= \mu \\ \operatorname{Var}[X]&=\phi \mu^p \end{align} What is the formula for…
5
votes
0 answers

Generalized linear mixed-effects models, R^2 and the Tweedie distribution

I am modelling data exhibiting a Tweedie distribution in R using glmer (package lme4). To compare the models I would like to use the AIC and R^2. I have a couple of questions on this (example code at the end): 1) Should I even be using the Tweedie…
5
votes
1 answer

A model for non-negative data with many zeros: pros and cons of Tweedie GLM

I analyze technical measurement data with the aim of developing a forecasting model. The data is given as a non-negative time series (data per hour). The data looks quite wild and contains many zeros. I expect these zeros to be the result of…
5
votes
1 answer

Gamma vs Tweedie distribution for large productivity dataset

I'm running some GAMs using the mgcv R package on a dataset with ~8.5k observations, where productivity is the response and environmental conditions are the covariates. However, I am unsure of which distribution to use and was seeking some advice. The…
4
votes
0 answers

R code for Tweedie compound Poisson-gamma

I have modeled claim frequency data using Poisson regression and claim amount using gamma. I have seen the Tweedie compound Poisson-gamma distribution used to model aggregate claim data. I am quite new to R and I am trying my best to model it using the…
4
votes
2 answers

Does the dependent variable in a GLM have to be transformed before running the model or does the model do it?

I'm trying to fit a Tweedie model with statsmodels and was wondering if I have to transform the dependent variable before I run the model or if statsmodels does that automatically? My model is as follows: lost_cost_model = smf.OLS(y, x, family =…
4
votes
0 answers

How does a Tweedie GLM handle an offset?

I am trying to fit a model with a GLM using a Tweedie family. I use an index parameter p between 1 and 2 to get a compound Poisson-gamma distribution to fit my data. But I want to use an offset only on the Poisson part of the regression. So my…