Questions tagged [approximate-inference]

60 questions
46
votes
1 answer

Variational inference versus MCMC: when to choose one over the other?

I think I get the general idea of both VI and MCMC, including the various flavors of MCMC like Gibbs sampling, Metropolis-Hastings, etc. This paper provides a wonderful exposition of both methods. I have the following questions: If I wish to do…
13
votes
3 answers

What does one mean by "numerical integration is too expensive"?

I am reading about Bayesian inference and came across the phrase "numerical integration of the marginal likelihood is too expensive". I do not have a background in mathematics and was wondering what exactly "expensive" means here. Is it just…
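
A minimal sketch of what "expensive" means in practice, using a hypothetical conjugate toy model (x ~ Normal(θ, 1), θ ~ Normal(0, 1)) chosen only because the exact marginal likelihood is known: a grid rule is fine in one dimension, but the number of likelihood evaluations grows as N^D in D dimensions.

```python
import numpy as np
from scipy.stats import norm

# Toy illustration of approximating a marginal likelihood
#   p(x) = \int p(x | theta) p(theta) dtheta
# with a grid (trapezoidal) rule. Hypothetical model: x ~ Normal(theta, 1),
# theta ~ Normal(0, 1), so the exact answer is available for comparison.

x_obs = 1.3
N = 1000                                   # grid points per dimension
theta = np.linspace(-10.0, 10.0, N)

integrand = norm.pdf(x_obs, loc=theta, scale=1.0) * norm.pdf(theta, 0.0, 1.0)
p_x_grid = np.trapz(integrand, theta)      # 1D: N likelihood evaluations

p_x_exact = norm.pdf(x_obs, 0.0, np.sqrt(2.0))   # conjugate closed form
print(p_x_grid, p_x_exact)                 # agree to several decimals

# In D dimensions the same grid needs N**D evaluations:
# 1000**10 = 1e30 for a 10-dimensional parameter. This exponential
# growth is what "too expensive" usually refers to.
```
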
9
votes
0 answers

Rao-Blackwellization in variational inference

The Black Box VI paper introduces Rao-Blackwellization in Section 3.1 as a method to reduce the variance of the score-function gradient estimator. However, I don't quite get the basic idea behind those formulas; please give me some hints and…
7
votes
2 answers

Estimating the gradient of log density given samples

I am interested in estimating the gradient of the log probability distribution $\nabla\log p(x)$ when $p(x)$ is not analytically available but is only accessed via samples $x_i \sim p(x)$. There seems to be various possible solutions utilizing…
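
Not one of the estimators the question alludes to, but a transparent baseline sketch: fit a kernel density estimate to the samples and differentiate its log numerically. The 1-D Gaussian data here are hypothetical, chosen so the true score is known for comparison.

```python
import numpy as np
from scipy.stats import gaussian_kde

# Crude baseline: estimate p(x) with a KDE, then take a central-difference
# derivative of log p_hat(x). (Score-matching / Stein-type estimators are
# usually preferable; this is just the most transparent starting point.)

rng = np.random.default_rng(0)
samples = rng.normal(loc=2.0, scale=1.5, size=5000)   # x_i ~ p(x), toy Gaussian

kde = gaussian_kde(samples)

def grad_log_p(x, eps=1e-4):
    """Central-difference estimate of d/dx log p_hat(x)."""
    return (np.log(kde(x + eps)) - np.log(kde(x - eps))) / (2 * eps)

x0 = 3.0
print(grad_log_p(x0), -(x0 - 2.0) / 1.5**2)   # KDE estimate vs. true score
```
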
7
votes
1 answer

How is ABC more computationally efficient than exact Bayesian Computation for parameter estimation in dynamical systems (ODE) models?

Approximate Bayesian Computation has been suggested as an approach to parameter estimation for computationally intensive simulations, most commonly in population genetics, but also in dynamical systems, for example, Toni 2009 and applied in the…
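
A minimal ABC rejection sketch of the mechanism being compared; the one-line "simulator" here is a hypothetical stand-in, and a real dynamical-systems application would run the ODE solver in its place.

```python
import numpy as np

# ABC rejection: draw parameters from the prior, simulate data, and keep
# the draws whose simulated output is within a tolerance eps of the
# observation -- no likelihood evaluation needed.

rng = np.random.default_rng(0)
y_obs = 2.0
eps = 0.1
n_draws = 100_000

theta = rng.uniform(-5.0, 5.0, size=n_draws)     # prior draws
y_sim = rng.normal(theta, 1.0)                   # hypothetical one-line "simulator"
posterior_sample = theta[np.abs(y_sim - y_obs) < eps]

print(posterior_sample.size, posterior_sample.mean())
```
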
6
votes
3 answers

Maximum likelihood estimator that is not a function of a sufficient statistic

I always read that every maximum likelihood estimator has to be a function of any sufficient statistic. The idea is that, if we are dealing with a random variable $X$ with mass or density function $f(x\mid\theta)$, and $T$ is a sufficient statistic…
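
A standard concrete instance of the claim: for $X_1,\dots,X_n$ i.i.d. Bernoulli$(\theta)$, the factorization theorem shows $T(x)=\sum_{i=1}^n x_i$ is sufficient, and
$$L(\theta) = \theta^{T(x)}(1-\theta)^{\,n - T(x)}, \qquad \hat\theta_{\text{MLE}} = \frac{T(x)}{n} = \bar{x},$$
so the MLE depends on the sample only through $T$.
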
6
votes
1 answer

What are the advantages of normalizing flow over VAEs with deep latent gaussian models for inference?

I am reading the normalizing flow paper and am a bit confused. It seems that being able to model complex (correlated?) posteriors is one of the advantages of the proposed approach (Section 2.3, last paragraph). But for a deep latent Gaussian model…
5
votes
2 answers

Gradient of the expectation of a function w.r.t. distribution parameters

In Section 2.2 of Kingma & Welling's paper on variational auto-encoders, the authors write the following equality for the gradient of the expectation of a function with respect to the parameters of the probability distribution: $$ \nabla_\phi…
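
For reference, the truncated equality appears to be the standard log-derivative (score-function) identity,
$$\nabla_\phi\,\mathbb{E}_{q_\phi(z)}\!\left[f(z)\right] \;=\; \mathbb{E}_{q_\phi(z)}\!\left[f(z)\,\nabla_\phi \log q_\phi(z)\right],$$
which follows by writing the expectation as an integral, exchanging gradient and integral, and using $\nabla_\phi q_\phi(z) = q_\phi(z)\,\nabla_\phi \log q_\phi(z)$.
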
5
votes
1 answer

Explanation of the 'free bits' technique for variational autoencoders

I have been reading through a couple of papers on the variational autoencoder model: 'Variational Lossy Autoencoder' and 'Improving Variational Inference With Inverse Autoregressive Flow'. There is one (perhaps very obvious) thing that is confusing…
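
A rough numerical sketch of the free-bits idea as described in the IAF paper; the function name and the λ value here are illustrative, not taken from the papers. The per-unit KL is averaged over the minibatch and floored at a constant λ, so units with tiny KL contribute a constant to the loss rather than being pushed all the way to zero.

```python
import numpy as np

def free_bits_kl(kl_per_unit, lam=0.25):
    """kl_per_unit: array of shape (batch_size, n_latent_units) of KL values."""
    kl_mean = kl_per_unit.mean(axis=0)        # average KL per unit over the minibatch
    return np.maximum(kl_mean, lam).sum()     # floor each unit at lam nats, then sum

# Units whose average KL is below lam are charged lam instead of ~0:
kl = np.array([[0.01, 1.2, 0.5],
               [0.03, 0.9, 0.7]])
print(free_bits_kl(kl))   # 0.25 + 1.05 + 0.6 = 1.9
```
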
4
votes
1 answer

Variational inference derivation

According to this lecture note, Eq. 25 gives the coordinate ascent update for latent variable $z_k$ as follows $$q^*(z_k)\propto\exp(E_{-k}[\log{p(z_k,Z_{-k},x)}])$$ and I understand the derivation for this formula. But in the following Bayesian…
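
A minimal sketch of that update applied to the classic toy case from Bishop's PRML (Section 10.1.2): the target is a bivariate Gaussian with mean m and precision matrix L, the approximation is fully factorized $q(z_1)q(z_2)$, and each coordinate update is the Gaussian given by the formula above with the other factor replaced by its current expectation.

```python
import numpy as np

# Mean-field VI for a bivariate Gaussian target (Bishop PRML 10.1.2).
# q*(z1) = N(m1 - L12/L11 * (E[z2] - m2), 1/L11), and symmetrically for z2.

m = np.array([1.0, -1.0])
L = np.array([[2.0, 0.8],
              [0.8, 1.0]])                 # precision matrix of p(z1, z2)

E_z1, E_z2 = 0.0, 0.0                      # initial expectations under q
for _ in range(50):
    E_z1 = m[0] - L[0, 1] / L[0, 0] * (E_z2 - m[1])   # update q(z1)
    E_z2 = m[1] - L[1, 0] / L[1, 1] * (E_z1 - m[0])   # update q(z2)

print(E_z1, E_z2)   # converges to the true mean (1.0, -1.0)
```
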
4
votes
1 answer

How can I teach someone that "sampling from a given distribution" is hard?

Many people I know do not think that sampling from a given distribution is a hard problem in general. For example, much software provides functions to sample from the normal or uniform distribution. How can I explain to another person…
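
A small sketch of why this gets hard as soon as the target is not a named family: for an arbitrary unnormalized density we have to build a sampler ourselves, e.g. rejection sampling, and even in 1-D the acceptance rate can be tiny (the target here is a made-up narrow bimodal density).

```python
import numpy as np

# Rejection sampling from an unnormalized 1-D target f(x) on [0, 1]:
# propose uniformly, accept with probability f(x) / M. The narrower or
# higher-dimensional the target, the more proposals are wasted.

rng = np.random.default_rng(0)

def f(x):
    # made-up unnormalized target: two narrow bumps at 0.2 and 0.8
    return np.exp(-((x - 0.2) / 0.01) ** 2) + np.exp(-((x - 0.8) / 0.01) ** 2)

M = 1.0                                        # upper bound on f over [0, 1]
proposals = rng.uniform(0.0, 1.0, size=100_000)
accepted = proposals[rng.uniform(0.0, M, size=proposals.size) < f(proposals)]

print(accepted.size / proposals.size)          # acceptance rate around 3-4%
```
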
4
votes
0 answers

Convergence of approximate Gibbs sampling

We have a bivariate random variable $(X,Y)$ for which sampling is challenging. If we were to know how to sample from the conditionals $(X|Y)$ and $(Y|X)$, we could get samples from the joint using Gibbs sampling by iterating: $$x_{t+1} \sim…
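
For reference, a minimal Gibbs sampler for a toy case where the exact conditionals are available (bivariate standard normal with correlation ρ); the question is about what breaks when the conditional draws in this loop are only approximate.

```python
import numpy as np

# Gibbs sampling for (X, Y) bivariate normal, zero means, unit variances,
# correlation rho, so X | Y=y ~ N(rho*y, 1 - rho**2) and symmetrically.

rng = np.random.default_rng(0)
rho = 0.9
n_iter = 50_000

x, y = 0.0, 0.0
samples = np.empty((n_iter, 2))
for t in range(n_iter):
    x = rng.normal(rho * y, np.sqrt(1 - rho**2))   # x_{t+1} ~ p(x | y_t)
    y = rng.normal(rho * x, np.sqrt(1 - rho**2))   # y_{t+1} ~ p(y | x_{t+1})
    samples[t] = (x, y)

print(np.corrcoef(samples[1000:].T))   # off-diagonal ≈ rho
```
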
4
votes
1 answer

Examples where the evaluation of the posterior distribution $p(Z|X)$ of the latent variables $Z$ is a central task?

In PRML Chapter 10, Approximate Inference, the first sentence of the chapter says: A central task in the application of probabilistic models is the evaluation of the posterior distribution $p(Z|X)$ of the latent variables $Z$ given the observed…
3
votes
1 answer

How to Test Linear Hypotheses about Parameters in Simulation-Based Indirect Inference

Setup: I have a model that produces a vector of aggregate outcomes, $\theta$, based on parameters, $\beta$. The relationship $\theta=\Theta(\beta)$ is stochastic and analytically intractable, but I can simulate draws of $\theta$ given $\beta$. I…
3
votes
1 answer

Expectation Maximisation vs Expectation Propagation in the context of Bayesian Networks

I am confused about the Expectation Maximisation and Expectation Propagation algorithms in the context of Bayesian Networks, especially whether one comprises the other. What is the difference between expectation maximisation and expectation propagation?…