MCMC Gamma Distribution

Question

I am applying an MCMC simulation with a Gamma distribution. I am trying to simulate the rainfall in a city using data collected over 1000 days.

The first step is to simulate the "data collected over 1000 days", which follows a Gamma distribution with $mean = shape * scale = 2 * 2 = 4$.

shape <- 2; scale <- 2  # values implied by mean = shape * scale = 2 * 2 = 4
data <- rgamma(1000, shape, rate = 1/scale)

The second step is to prepare the prior and the likelihood of the Gamma distribution. I created two functions to support my MCMC code:

# Log Gamma prior density (up to an additive constant)
prioriGamma <- function(theta, lambda, nu){
  out <- (lambda-1)*log(theta) - theta*nu
  return(out)
}

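# Gamma log-likelihood, omitting the -BetaArg*sum(dadosArg) term (it cancels in the alpha update, where beta is fixed)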
verossiGamma <- function(nArg, AlphaArg, BetaArg, dadosArg){
  verossi <- nArg*AlphaArg*log(BetaArg) - nArg*lgamma(AlphaArg) + (AlphaArg-1)*sum(log(dadosArg))
  return(verossi)
}

Then I coded the MCMC simulation:

mcmc <- function(y, iter, sigma.rw, lambda.alpha, lambda.beta, nu.alpha, nu.beta, initAlpha, initBeta) {
  # # Temporary parameters
  # y = data; sigma.rw = 1; lambda.alpha = 1; lambda.beta = 1; nu.alpha = 1; nu.beta = 1
  # 
  # find n from the data (ie the number of observations)
  n <- length(y)
  
  # first create a table which will store the samples; columns are alpha, beta, loglik.cur, loglik.can, log.prior.alpha.cur, log.prior.alpha.can
  out <- matrix(NA, nrow=iter, ncol=6)
  
  # initial values
  alpha.cur <- initAlpha
  beta.cur <- initBeta
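  out[1,] <- c(alpha.cur, beta.cur, 0, 0, 0, 0)
  # initialise the diagnostics so out[i,] can be stored even if the first candidate is negative
  loglik.cur <- loglik.can <- log.prior.alpha.cur <- log.prior.alpha.can <- NA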
  
  # mcmc loop starts here
  for (i in 2:iter) {

    ###############
    # update alpha (assume beta is fixed)
    ###############
    
    # propose a new value for alpha
    alpha.can <- rnorm(1, alpha.cur, sigma.rw)
    
    # if it is negative reject straight away else compute the M-H ratio
    if (alpha.can > 0) {
      
      # evaluate the loglikelihood at the current values of alpha.
      loglik.cur <- verossiGamma(n, alpha.cur, beta.cur, y) 
      
      # compute the log-likelihood at the candidate value of alpha
      loglik.can <- verossiGamma(n, alpha.can, beta.cur, y) 
  
      
      # log prior densities
      log.prior.alpha.cur <- prioriGamma(alpha.cur, lambda.alpha, nu.alpha)
      log.prior.alpha.can <- prioriGamma(alpha.can, lambda.alpha, nu.alpha)
      
      logpi.cur <- loglik.cur + log.prior.alpha.cur
      logpi.can <- loglik.can + log.prior.alpha.can
      
      # M-H ratio
      
      # draw from a U(0,1)
      u <- runif(1)
      
      if (log(u) < logpi.can - logpi.cur) {
        alpha.cur <- alpha.can
      }
    }
    
    ###############
    # update beta
    ###############
    
    # exact Gamma full conditional of beta; sum(y) replaces the undefined sum.y
    beta.cur <- rgamma(1, alpha.cur*n + lambda.beta, rate = sum(y) + nu.beta)
    
    # store the samples
    out[i,] <- c(alpha.cur, beta.cur, loglik.cur, loglik.can, log.prior.alpha.cur, log.prior.alpha.can)
    
  }
  
  return(out)
}

Here I pass in the simulated Gamma data, the lambda and nu of the Gamma prior on alpha, the lambda and nu of the Gamma prior on beta, and two initial values for the Markov chain.

The code appears to work well. Nevertheless, I am struggling to understand, in the theoretical sense, exactly what is happening in the simulation. Because of that, I have a couple of questions I need help with.

Why is the Gamma distribution appropriate for the problem? I understand that the Gamma has a relationship with the Poisson and Exponential distributions, so it makes sense to use it to simulate a counting process. Nevertheless, I would like to understand better why it is useful for rainfall simulation.
What values should I pick for the MCMC simulation?

res <- mcmc(y = data, iter = 10000, sigma.rw = 1, lambda.alpha = 1, lambda.beta = 1, nu.alpha = 1, nu.beta = 1, initAlpha = 6, initBeta = 2)

When I chose the parameters of the Gamma distribution, I knew that my mean should be around 4. But I don't know which values I should use for the prior hyperparameters lambda.alpha, lambda.beta, nu.alpha, and nu.beta in order to correctly express my beliefs about the data.

I noticed that, independently of the initial values I use, the MCMC converges to the true parameters.

I took the code from this source: https://www.maths.nottingham.ac.uk/plp/pmztk/files/MCMC2-Seattle/Lab-Sessions/2/tutorial_gamma.html

I made some small changes to adapt it for myself and applied it to a practical problem to build my theoretical understanding of the problem.

Thank you for your support!

More than the R code is borrowed from [Theo's notes](https://www.maths.nottingham.ac.uk/plp/pmztk/files/MCMC2-Seattle/Lab-Sessions/2/tutorial_gamma.html): the whole Bayesian model is, along with the specific Metropolis-within-Gibbs resolution. It is thus unclear why the generic framework of the tutorial would apply to the application chosen by the OP. — Xi'an, Mar 27 '21 at 16:51

Xi'an · Accepted Answer · 2021-03-27T16:52:17.327

The question is not about the MCMC method, nor about the R code, but rather about Bayesian inference.

The sampling model is a Gamma model $$x_1,\ldots,x_n \sim \mathcal Ga(\alpha,\beta)$$ whose parameters $\alpha$ and $\beta$ are unknown and inferred from the data $x_1,\ldots,x_n$ using Bayesian inference.

The specific prior distribution on the parameter $(\alpha,\beta)$ is made of two Gamma distributions:$$\alpha \sim \mathcal Ga(\lambda_\alpha,\nu_\alpha)\qquad \beta \sim \mathcal Ga(\lambda_\beta,\nu_\beta)$$
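As a rough aid for reading these priors, the sketch below summarises a $\mathcal Ga(\lambda,\nu)$ distribution under the shape/rate convention implied by prioriGamma in the question (log density $(\lambda-1)\log\theta-\nu\theta$ up to a constant). The helper name gamma_prior_summary is hypothetical and is not part of the tutorial.

```r
# Hypothetical helper (not from the tutorial): summarise a Ga(lambda, nu) prior,
# assuming lambda is the shape and nu is the rate, as in prioriGamma().
gamma_prior_summary <- function(lambda, nu) {
  c(mean = lambda / nu,                               # prior mean
    sd   = sqrt(lambda) / nu,                         # prior standard deviation
    q025 = qgamma(0.025, shape = lambda, rate = nu),  # lower 2.5% quantile
    q975 = qgamma(0.975, shape = lambda, rate = nu))  # upper 97.5% quantile
}

# The hyperparameters used in the question, lambda = nu = 1, give an Exp(1) prior.
gamma_prior_summary(1, 1)
```

Comparing such summaries with what is believed about $\alpha$ and $\beta$ before seeing the data is one possible way to judge whether a given hyperparameter choice is reasonable.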

The MCMC algorithm reproduced therein is a Metropolis-within-Gibbs algorithm that produces a Markov chain converging to the true posterior distribution $$\pi(\alpha,\beta|x_1,\ldots,x_n,\lambda_\alpha,\nu_\alpha,\lambda_\beta,\nu_\beta)$$

(Disclaimer: this is explained in more detail in the tutorial written by Theo Kypraios, I just did not check the link before answering!)
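For reference, the Metropolis step on $\alpha$ in the question's code can be written out as follows, reading $\sigma$ as the sigma.rw argument and $\alpha'$ as alpha.can. This is a sketch reconstructed from the code and the model above, not a quotation from the tutorial. With $\beta$ held fixed, the conditional target is
$$\pi(\alpha|\beta,x_1,\ldots,x_n) \propto \frac{\beta^{n\alpha}}{\Gamma(\alpha)^n}\Big(\prod_{i=1}^n x_i\Big)^{\alpha-1}\alpha^{\lambda_\alpha-1}e^{-\nu_\alpha\alpha}$$
and a candidate $\alpha'\sim\mathcal N(\alpha,\sigma^2)$ is rejected if $\alpha'\le 0$ and otherwise accepted with probability
$$\min\left\{1,\frac{\pi(\alpha'|\beta,x_1,\ldots,x_n)}{\pi(\alpha|\beta,x_1,\ldots,x_n)}\right\}$$
since the Normal random-walk proposal is symmetric. The likelihood factor $e^{-\beta\sum_i x_i}$ does not depend on $\alpha$ and cancels in this ratio, which is why verossiGamma can omit it without changing the acceptance decision.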

Question 1: Why is the Gamma Distribution appropriate?

The Gamma distribution appears three times, as a sampling distribution and as two prior distributions. There may be a particular reason for picking a Gamma as the distribution of the data, but the other two Gammas are choices of prior distributions, hence do not correspond to a "truth". They may reflect some prior knowledge, or else be chosen for computational convenience. In particular, the Gamma prior on $\beta$ is a conjugate prior, meaning the conditional posterior of $\beta$ is also a Gamma distribution. In any case, the MCMC algorithm is adapted to this specific choice of prior (and all choices of $(\lambda_\alpha,\nu_\alpha,\lambda_\beta,\nu_\beta)$) and would obviously need to be modified for other classes of priors. The statement that "the MCMC converges to the true parameters" should rather be that the chain converges to the true posterior.
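The conjugacy can be checked in one line, using the sampling model and the prior on $\beta$ given above (a worked step added for illustration):
$$\pi(\beta|\alpha,x_1,\ldots,x_n) \propto \beta^{n\alpha}e^{-\beta\sum_{i=1}^n x_i}\times\beta^{\lambda_\beta-1}e^{-\nu_\beta\beta} = \beta^{n\alpha+\lambda_\beta-1}e^{-\left(\sum_{i=1}^n x_i+\nu_\beta\right)\beta}$$
that is,
$$\beta|\alpha,x_1,\ldots,x_n \sim \mathcal Ga\Big(n\alpha+\lambda_\beta,\ \sum_{i=1}^n x_i+\nu_\beta\Big)$$
which is the distribution the rgamma call draws from in the beta update of the question's code.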

Question 2: What values should I pick up to use in the MCMC simulation?

As stated above, the algorithm formally operates the same way for all values of the hyperparameters $(\lambda_\alpha,\nu_\alpha,\lambda_\beta,\nu_\beta)$, and the outcome of the MCMC algorithm reflects this choice. Changing the hyperparameters and re-running the MCMC algorithm does not bring any information on how to choose them.
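A minimal sketch of this point, under a simplifying assumption that is not in the question: $\alpha$ is treated as known (equal to 2), so that the posterior of $\beta$ is available in closed form and no MCMC is needed. The function name beta_posterior_mean, the seed and the second set of hyperparameters are illustrative choices.

```r
set.seed(1)                                # arbitrary seed, for reproducibility only
n <- 1000
y <- rgamma(n, shape = 2, rate = 1 / 2)    # same simulation setup as in the question

# Assumption for illustration: alpha is known, so
# beta | y ~ Ga(n * alpha + lambda.beta, sum(y) + nu.beta) exactly.
beta_posterior_mean <- function(y, alpha, lambda.beta, nu.beta) {
  (length(y) * alpha + lambda.beta) / (sum(y) + nu.beta)
}

# Prior used in the question: Ga(1, 1), prior mean 1
post_vague  <- beta_posterior_mean(y, alpha = 2, lambda.beta = 1,   nu.beta = 1)
# A much more concentrated prior: Ga(100, 10), prior mean 10
post_strong <- beta_posterior_mean(y, alpha = 2, lambda.beta = 100, nu.beta = 10)

c(vague = post_vague, strong = post_strong)
```

Each value is the exact posterior mean for its own prior. The two differ because the priors differ, and neither run says which set of hyperparameters was the right one to use.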

Why do I apply MC to the alpha parameter and a Gibbs sampler to the beta parameter? — Arduin, Mar 30 '21 at 03:25
You are free to use a Metropolis-Hastings step for the $\beta$ parameter. Using the exact conditional distribution avoids calibrating this Metropolis-Hastings step. — Xi'an, Mar 30 '21 at 05:25
