
I have started studying MCMC and Bayesian inference. Bayesian inference usually requires estimating a posterior probability, like below:

$$P(\theta \mid y)=\frac{P(\theta)\,P(y \mid \theta)}{\text{normalization factor}}$$
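Written out, the normalization factor is the marginal likelihood of the data, obtained by integrating the numerator over $\theta$:

$$P(y)=\int P(y \mid \theta)\,P(\theta)\,\mathrm{d}\theta$$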

Here $y=\{y_1,y_2,y_3,\dots,y_N\}$ is a known (observed) random sequence, and $p(y\mid\theta)$ and $p(\theta)$ are also known.

To my knowledge, MCMC is mainly used to generate samples from a specific, complex distribution.

I'm confused about how MCMC is used in this scenario, so my question is: how do I use MCMC to estimate $P(\theta \mid y)$ based on the observed sequence $y$?

Moreover, because $\theta$ is a random variable, different values of $\theta$ correspond to different values of $P(\theta \mid y)$. How do I select an optimal $\theta$ as the current best estimate?

Ntydrm

1 Answer


As you said, MCMC estimates the posterior distribution. To get a point estimate for $\theta$, it's your decision to use MAP, posterior expectation or something else. MCMC creates samples from the unnormalized version on the RHS, i.e. $p(\theta)p(y|\theta)$. You'll need to choose a prior for $\theta$ and also express the likelihood of the data, i.e. $p(y|\theta)$.

For example, the $y_i$ can be assumed to be iid Bernoulli RVs, where $\theta$ is the probability of observing $y_i=1$. Then we can write the likelihood as follows: $$p(y\mid\theta)=\prod_{i=1}^N\theta^{y_i}(1-\theta)^{1-y_i}$$
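A minimal sketch of how this could look in practice, assuming a random-walk Metropolis sampler, a Uniform(0, 1) prior on $\theta$, and simulated data (none of these choices come from the question itself):

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated observations: assumed iid Bernoulli(theta_true), made up for illustration.
theta_true = 0.7
y = rng.binomial(1, theta_true, size=100)

def log_unnormalized_posterior(theta, y):
    """log[p(theta) * p(y | theta)] with a Uniform(0, 1) prior on theta."""
    if theta <= 0.0 or theta >= 1.0:
        return -np.inf  # prior density is zero outside (0, 1)
    # Bernoulli log-likelihood: sum_i [ y_i*log(theta) + (1 - y_i)*log(1 - theta) ]
    return np.sum(y * np.log(theta) + (1 - y) * np.log(1 - theta))

# Random-walk Metropolis sampler
n_iter = 20000
samples = np.empty(n_iter)
theta = 0.5  # starting value
log_p = log_unnormalized_posterior(theta, y)

for i in range(n_iter):
    proposal = theta + rng.normal(0, 0.05)          # symmetric proposal
    log_p_prop = log_unnormalized_posterior(proposal, y)
    if np.log(rng.uniform()) < log_p_prop - log_p:  # Metropolis accept/reject step
        theta, log_p = proposal, log_p_prop
    samples[i] = theta

posterior_samples = samples[5000:]  # discard burn-in

# Point estimates and uncertainty from the posterior samples
print("posterior mean:", posterior_samples.mean())
print("95% credible interval:", np.percentile(posterior_samples, [2.5, 97.5]))
```

Since a Uniform(0, 1) prior is a Beta(1, 1), this particular posterior is also available in closed form as Beta$(1+\sum_i y_i,\; 1+N-\sum_i y_i)$, which gives a handy sanity check on the MCMC output.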

gunes
  • Can I understand it this way: use MCMC to generate $\theta$ samples according to the distribution $p(\theta\mid y)$, and then select the single best $\theta$ sample among them by maximizing $p(\theta)\,p(y\mid\theta)$, where $y$ is the known sequence? – Ntydrm May 20 '20 at 10:12
  • Basically, yes, but note that which point estimate to choose (e.g. MAP, expected value) depends on you. You're given the distribution, from which you can also calculate confidence intervals, variance, etc. – gunes May 20 '20 at 10:25
  • Got it. MCMC is useful to speed up these statistical calculations. In practice, how do you define the prior distribution of $\theta$, which is like a hyperparameter? Is it mainly based on experience? Do you have a simple example for reference? – Ntydrm May 20 '20 at 10:35
  • Generally, it's based on experience. The choice of prior has always been a cryptic part of Bayesian methods. – gunes May 20 '20 at 11:40