
I've got some data that looks like it is Gamma distributed. I've constructed the prior distribution from mean = 232 and standard deviation = 150, which yield the Gamma distribution parameters:

a_prior (shape) = 2.392 and b_prior (scale) = 96.98
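
For reference, the moment matching I'm using looks roughly like this in R (taking mean $= ab$ and variance $= ab^2$ for a Gamma with shape $a$ and scale $b$):

# Moment matching for the prior: mean = a*b, variance = a*b^2
prior_mean <- 232
prior_sd   <- 150
a_prior <- (prior_mean / prior_sd)^2   # shape, about 2.392
b_prior <- prior_sd^2 / prior_mean     # scale, about 96.98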

Now, I'm looking at the Wikipedia page for conjugate priors, and the update rules for the Gamma hyperparameters are:

$$\alpha_0 + n\alpha,\qquad \beta_0+\sum_{i=1}^n x_i$$

The mean of my sample data is $278$ and the standard deviation of the sample data is $162$. These yield $a$ and $b$ to be $2.944$ and $94.4$, respectively. The sample data consists of $15$ data points.
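
And the same moment matching for the sample data:

# Moment matching for the sample data
sample_mean <- 278
sample_sd   <- 162
a_data <- (sample_mean / sample_sd)^2   # shape, about 2.944
b_data <- sample_sd^2 / sample_mean     # scale, about 94.4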

I don't understand the update rules from Wikipedia. As far as I can tell, the posterior a' = prior_a + n * a, which in this case is 2.39 + 15 * 2.944, giving a posterior shape parameter of about 46.5, which seems quite large.

Not to mention the posterior b' parameter, which looks like it's the prior b + sum(data points). My data points add up to 4451, so the posterior b' is:

posterior b' = 96.98 + 4451 = 4548, which again seems quite large for a scale parameter.
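
So, as I read the Wikipedia table, the update would amount to something like:

# My reading of the update rule (n = 15 data points, sum of data = 4451)
n <- 15
a_posterior <- 2.392 + n * 2.944   # about 46.5
b_posterior <- 96.98 + 4451        # about 4548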

Can someone tell me how the updating for a conjugate Gamma should go? It seems to me that the larger the sample size n, the more the shape parameter blows up. Similarly, the more data points there are in the sum, the larger the scale parameter grows.

Clearly I am misunderstanding something, or else the data is simply not Gamma distributed.

I looked at the Gamma conjugate prior for the Poisson, which involves calculating an equivalent sample size, but I don't see how it relates to updating a Gamma prior to a Gamma posterior.

Any help on understanding my misunderstanding is appreciated.


1 Answer


The Gamma prior is not conjugate for the Gamma sampling model when both the shape and the scale are unknown. The Wikipedia update rule you quote applies to the case where the shape of the likelihood is known and a Gamma prior is placed on the rate parameter. See Conjugate prior for a Gamma distribution.

The following R code shows how to sample from the posterior of a Gamma model with uniform priors on $[0,1000]$ for the shape and scale parameters, using an adaptive MCMC sampler; you can play with other priors.

set.seed(12345)
# Simulated data
data = rgamma(100,shape = 2.5,scale=97)

# Histogram and likelihood fit
hist(data)
library(MASS)
fitdistr(data,"gamma")


# -log posterior
mlogpost = function(par){
  loglik = sum(dgamma(data, shape = par[1], scale = par[2], log = TRUE))
  logprior = -2*log(1000)   # flat prior on [0,1000] x [0,1000] is a constant
  return(-loglik - logprior)
}


# Library for the MCMC sampler
library(Rtwalk)
# Initial point
init = c(2.5,97)

# Support of the posterior (matches the uniform priors on [0,1000])
Support <- function(x) {
    (x[1] > 0) & (x[1] < 1000) & (x[2] > 0) & (x[2] < 1000)
}
# Random initial points (see documentation of Rtwalk)
X0 <- function(x) { init + runif(2,-0.1,0.1) }

# Sampler
info <- Runtwalk( dim=2,  Tr=105000,  Obj=mlogpost, Supp=Support, x0=X0(), xp0=X0()) 

# Burn-in and thinning of the posterior samples
ind = seq(5000,105000,by=10)
# Column 1 holds the shape (par[1]), column 2 the scale (par[2])
sh = info$output[,1][ind]
sc = info$output[,2][ind]

# histograms of the posterior samples
hist(sc)
hist(sh)

# summary of the posterior samples
summary(sc)
summary(sh)
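
As a rough check against the fitdistr estimates above, you can also look at posterior means and, say, 95% credible intervals:

# Posterior means and 95% credible intervals
mean(sh); quantile(sh, c(0.025, 0.975))
mean(sc); quantile(sc, c(0.025, 0.975))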