
I have an MCMC simulation that tries to fit a line to a linear set of data. The auto-correlation is very high for the slope parameter (~0.9) and low (~0.05) for the bias.

What does a high auto-correlation say about an MCMC run?

Is it necessarily a negative thing?

What are the main causes of high auto-correlation?

Shayan Shafiq
Boto
  • Are you talking about high autocorrelation between values in your simulated chain? – epp Mar 06 '19 at 21:32
  • Yeah, successive links in the Markov chain are highly correlated for the slope. – Boto Mar 07 '19 at 18:46
  • what sampling method are you using? – epp Mar 07 '19 at 20:25
  • Pardon my ignorance on the terminology. Is a sampling method the algorithm that I am running, or the way I am sampling from the proposal distribution? I'm using the Metropolis-Hastings algorithm. I am sampling from the proposal distribution by taking a random number from a Gaussian distribution with mean and standard deviation defined a priori, individually for each parameter (slope and bias). – Boto Mar 07 '19 at 22:35
  • If you're using MH then you can alter the autocorrelation by tweaking the standard deviation of your proposal distribution for that parameter. – epp Mar 08 '19 at 01:43
  • Thank you, I'll test that out. Are there other possible reasons for autocorrelation in an MCMC simulation? – Boto Mar 08 '19 at 13:43
  • the autocorrelation is a direct consequence of drawing samples from a proposal distribution centered on the current parameter value. This leads to correlation between realizations of your posterior. – epp Mar 08 '19 at 14:00
  • Could an issue also arise if the samples that are accepted are only the ones that are close to the mean of the distribution? – Boto Mar 08 '19 at 14:12
  • yeah so that's what I was saying about your proposal distribution variance needing to be changed to change the acceptance rate of your realizations – epp Mar 09 '19 at 03:05

1 Answer


What does a high auto-correlation say about an MCMC run?

A high auto-correlation means that successive draws are very similar: the Markov chain is taking small steps and is not able to jump long distances. This may be because the proposal distribution has a small variance, so that jumps are proposed too close to the current state. However, if you increase the proposal variance too much, a lot of proposals will be rejected and the chain will stay where it is, which also increases the autocorrelation. Thus a balance is required. Usually an acceptance probability of 0.234 is the target for high-dimensional problems, and 0.44 for one-dimensional problems. Since you have two dimensions here, you should tune your proposal variance so that the acceptance rate is between 0.234 and 0.44.
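The trade-off can be seen in a small experiment. The following is only a sketch under stated assumptions: the target is a one-dimensional standard normal rather than your line-fit posterior, and the function names and the three proposal standard deviations are made up for illustration.

```python
import numpy as np

def log_target(x):
    # Log-density of a standard normal, up to an additive constant
    # (an illustrative stand-in for a real posterior).
    return -0.5 * x**2

def random_walk_metropolis(log_density, x0, proposal_sd, n_steps, rng):
    # Random-walk Metropolis with a Gaussian proposal centered on the current value.
    chain = np.empty(n_steps)
    x, logp = x0, log_density(x0)
    n_accept = 0
    for i in range(n_steps):
        proposal = x + proposal_sd * rng.standard_normal()
        logp_prop = log_density(proposal)
        # Accept with probability min(1, target(proposal) / target(current)).
        if np.log(rng.uniform()) < logp_prop - logp:
            x, logp = proposal, logp_prop
            n_accept += 1
        chain[i] = x  # a rejected proposal repeats the current value
    return chain, n_accept / n_steps

def lag1_autocorr(chain):
    # Sample autocorrelation between consecutive draws.
    c = chain - chain.mean()
    return np.dot(c[:-1], c[1:]) / np.dot(c, c)

rng = np.random.default_rng(0)
results = {}
for sd in (0.05, 2.4, 50.0):  # too small, moderate, too large
    chain, acc = random_walk_metropolis(log_target, 0.0, sd, 20000, rng)
    results[sd] = (acc, lag1_autocorr(chain))
    print(f"proposal sd={sd:5.2f}  acceptance={acc:.2f}  lag-1 autocorr={results[sd][1]:.2f}")
```

In this toy setting the smallest proposal accepts almost everything and the largest rejects almost everything, yet both should show a lag-1 autocorrelation close to 1, while the moderate proposal should give the lowest value.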

Is it necessarily a negative thing?

This is not simple to answer, but in short, yes, it is a negative thing. The long explanation is that with high autocorrelation each sample is highly correlated with the previous one, so each new sample contributes less. That is, the sample provides significantly less information than it would with low autocorrelation. In addition, because the samples are highly correlated, the chain may not have explored the state space well enough. However, both issues can be somewhat handled if you just choose a large sample size. There are ways to figure out how many samples are reasonable by accounting for the autocorrelation: my answer here.
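To get a rough feel for how much information is lost, one can estimate an effective sample size as n / (1 + 2 Σ ρ_k), where ρ_k is the lag-k autocorrelation. The sketch below is illustrative only: it assumes each chain behaves like an AR(1) process with the lag-1 values from the question (0.9 and 0.05), and it truncates the sum at the first non-positive autocorrelation, which is a simple rule and not the estimator from the linked answer. All function names are hypothetical.

```python
import numpy as np

def autocorr(chain, max_lag):
    # Sample autocorrelations at lags 0..max_lag.
    c = np.asarray(chain, dtype=float) - np.mean(chain)
    var = np.dot(c, c)
    return np.array([np.dot(c[:len(c) - k], c[k:]) / var for k in range(max_lag + 1)])

def effective_sample_size(chain, max_lag=1000):
    # ESS = n / (1 + 2 * sum of autocorrelations), with the sum
    # truncated at the first non-positive value (a simple heuristic).
    rho = autocorr(chain, min(max_lag, len(chain) - 1))
    tau = 1.0
    for r in rho[1:]:
        if r <= 0:
            break
        tau += 2.0 * r
    return len(chain) / tau

def ar1_chain(phi, n, rng):
    # Synthetic stand-in for MCMC output: AR(1) with lag-1 correlation phi
    # and unit stationary variance.
    x = np.empty(n)
    x[0] = rng.standard_normal()
    noise = rng.standard_normal(n) * np.sqrt(1 - phi**2)
    for t in range(1, n):
        x[t] = phi * x[t - 1] + noise[t]
    return x

rng = np.random.default_rng(1)
n = 20000
ess_slope = effective_sample_size(ar1_chain(0.9, n, rng))   # like the slope chain
ess_bias = effective_sample_size(ar1_chain(0.05, n, rng))   # like the bias chain
print(f"n={n}  ESS(slope-like)={ess_slope:.0f}  ESS(bias-like)={ess_bias:.0f}")
```

For an exact AR(1) process with correlation 0.9 the theoretical factor is (1 + 0.9) / (1 - 0.9) = 19, so under this assumption the slope chain would need roughly 19 times as many draws as an independent sample to carry the same information.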

What are the main causes of high auto-correlation?

One of them, as explained, is the variance of the proposal. Another factor that affects autocorrelation is the starting value of the Markov chain. If the starting value is far from an area of high probability, then the chain might move very slowly towards that area. Yet another reason can be multimodality of the target distribution. If the target is multimodal, the chain might get "stuck" in a mode for a long time, and not be able to jump across modes often. This increases autocorrelation as well. Finally, some target distributions are complicated in ways other than being multimodal. They might have heavy tails or a non-smooth density, which will also affect autocorrelation.
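The multimodal case can be illustrated with a toy example. This is a sketch under assumptions, not your model: the target is an equal mixture of two unit-variance normals centered at -6 and +6, and the two proposal standard deviations and the helper names are chosen only for illustration.

```python
import numpy as np

def log_bimodal(x, mode=6.0):
    # Log-density (up to a constant) of an equal mixture of N(-mode, 1) and N(+mode, 1).
    return np.logaddexp(-0.5 * (x - mode) ** 2, -0.5 * (x + mode) ** 2)

def random_walk_metropolis(log_density, x0, proposal_sd, n_steps, rng):
    # Random-walk Metropolis with a Gaussian proposal centered on the current value.
    chain = np.empty(n_steps)
    x, logp = x0, log_density(x0)
    for i in range(n_steps):
        proposal = x + proposal_sd * rng.standard_normal()
        logp_prop = log_density(proposal)
        if np.log(rng.uniform()) < logp_prop - logp:
            x, logp = proposal, logp_prop
        chain[i] = x
    return chain

def mode_switches(chain):
    # Number of times the chain crosses zero, i.e. moves from one mode to the other.
    signs = np.sign(chain)
    return int(np.sum(signs[1:] != signs[:-1]))

rng = np.random.default_rng(2)
narrow = random_walk_metropolis(log_bimodal, 6.0, 0.5, 20000, rng)  # small steps
wide = random_walk_metropolis(log_bimodal, 6.0, 6.0, 20000, rng)    # large steps

for name, chain in (("narrow", narrow), ("wide", wide)):
    print(f"{name:6s} switches={mode_switches(chain):4d}  "
          f"fraction in right mode={np.mean(chain > 0):.2f}")
```

With the small proposal the chain should stay in the mode where it started, so half of the target is never visited. With the larger proposal it should switch modes regularly and spend roughly half its time in each.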

Greenparker