Questions tagged [variational]

23 questions
4
votes
0 answers

Rao-Blackwellization in Black Box VI

In the paper, "Black Box Variational Inference," by Ranganath et al. (2013), the authors derive a Rao-Blackwellized estimator of the gradient of the evidence lower bound with respect to a single variational parameter $\lambda_i$. Notation: $q(z |…
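For context on this question: the estimator in the paper starts from the score-function ("REINFORCE") form of the ELBO gradient, $\nabla_\lambda \mathcal{L} = \mathbb{E}_q[\nabla_\lambda \log q(z|\lambda)\,(\log p(x,z) - \log q(z|\lambda))]$; Rao-Blackwellization replaces $\log p(x,z)$ with only the factors in the Markov blanket of $z_i$, reducing variance. A minimal NumPy sketch of the plain (non-Rao-Blackwellized) estimator on a one-dimensional toy model — the model, step size, and sample count are illustrative, not from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
x_obs = 2.0  # toy model: p(z) = N(0, 1), p(x | z) = N(z, 1)

def log_joint(z):
    # log p(x_obs, z) up to additive constants
    return -0.5 * z**2 - 0.5 * (x_obs - z)**2

def log_q(z, mu, log_sigma):
    # log N(z; mu, sigma^2) up to additive constants
    return -0.5 * ((z - mu) * np.exp(-log_sigma))**2 - log_sigma

def bbvi_grad(mu, log_sigma, n_samples=2000):
    # Score-function estimator: E_q[ grad log q * (log p - log q) ].
    # Dropped constants are harmless because E_q[grad log q] = 0.
    sigma = np.exp(log_sigma)
    z = mu + sigma * rng.standard_normal(n_samples)
    f = log_joint(z) - log_q(z, mu, log_sigma)
    score_mu = (z - mu) / sigma**2          # d log q / d mu
    score_ls = ((z - mu) / sigma)**2 - 1.0  # d log q / d log_sigma
    return np.array([(score_mu * f).mean(), (score_ls * f).mean()])

mu, log_sigma = 0.0, 0.0
for _ in range(500):
    g = bbvi_grad(mu, log_sigma)
    mu, log_sigma = mu + 0.02 * g[0], log_sigma + 0.02 * g[1]
print(mu, np.exp(log_sigma))  # exact posterior: N(1, 0.5), sigma ~ 0.707
```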
4
votes
0 answers

What's the intuition behind variational learning in deep NNs with an attention mechanism?

I'm trying to understand this paper: "Multiple Object Recognition With Visual Attention" (Ba et al., 2015); specifically, I'm trying to understand Section 3, which explains how the model is trained. Please let me jump right to the question, without…
Andrew B
  • 41
  • 1
3
votes
0 answers

Using calculus of variations to deduce lower bound on Pitman efficiency (asymptotic relative efficiency) between $t$-test and sign test

Let $Y_1,...,Y_n$ be iid draws from a location family $\{f(\cdot - \theta) : \theta \in \mathbb{R}\}$. $f$ is a symmetric density w.r.t. the Lebesgue measure on $\mathbb{R}$ with finite variance. We want to test $H_0 : \theta = 0$ versus $H_1 :…
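For anyone framing the variational problem: the classical Pitman ARE of the sign test $S$ relative to the $t$-test (see e.g. Lehmann, Elements of Large-Sample Theory) is $4\sigma_f^2 f(0)^2$; note that over merely symmetric densities the infimum is $0$, since $f(0)$ can vanish at fixed variance, so the nontrivial bound needs a unimodality-type constraint. A sketch of the standard result under that constraint:

```latex
e(S, t) \;=\; 4\,\sigma_f^{2}\,f(0)^{2},
\qquad
\inf_{\substack{f \text{ symmetric, unimodal} \\ \sigma_f^{2} < \infty}} e(S, t) \;=\; \tfrac{1}{3},
\quad\text{attained by } f = \mathrm{Unif}[-a, a]:\;
4 \cdot \tfrac{a^{2}}{3} \cdot \tfrac{1}{4a^{2}} \;=\; \tfrac{1}{3}.
```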
2
votes
0 answers

What exactly is the point of computing a lower bound for the log partition function in variational methods in probabilistic graphical models?

Variational methods are applied when we are interested in a probability distribution $P$ but only have a tractably computable unnormalized form $\tilde{P}$ of $P$. Knowing the partition function $Z = \sum_x \tilde{P}(x)$ is desirable because it…
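A short answer worth recording next to this question: the lower bound is the ELBO, and its gap is exactly $\mathrm{KL}(q \,\Vert\, P)$, so tightening the bound simultaneously certifies $Z$ from below and improves the approximating distribution $q$. Via Jensen's inequality:

```latex
\log Z
  \;=\; \log \sum_x q(x)\,\frac{\tilde{P}(x)}{q(x)}
  \;\ge\; \sum_x q(x)\,\log \frac{\tilde{P}(x)}{q(x)}
  \;=\; \mathbb{E}_q\bigl[\log \tilde{P}(x)\bigr] + H(q),
\qquad
\log Z - \Bigl(\mathbb{E}_q\bigl[\log \tilde{P}(x)\bigr] + H(q)\Bigr)
  \;=\; \mathrm{KL}\bigl(q \,\Vert\, P\bigr) \;\ge\; 0 .
```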
2
votes
3 answers

In VAE, why use MSE loss between input x and decoded sample x' from latent distribution?

Variational Autoencoders (VAEs) are based on the concept of Variational Inference (VI) and, like vanilla Autoencoders (AEs), use two neural networks for function approximation. I understood the derivation of the Evidence Lower Bound (ELBO) and…
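The standard resolution, for cross-reference: MSE is the negative log-likelihood of a Gaussian decoder with fixed isotropic variance, so the reconstruction term of the ELBO reduces to MSE up to scale and constants:

```latex
p_\theta(x \mid z) = \mathcal{N}\bigl(x;\, \hat{x}_\theta(z),\, \sigma^2 I\bigr)
\;\Longrightarrow\;
-\log p_\theta(x \mid z)
  \;=\; \frac{1}{2\sigma^2}\,\lVert x - \hat{x}_\theta(z) \rVert^{2}
  \;+\; \frac{D}{2}\,\log\bigl(2\pi\sigma^2\bigr).
```

With $\sigma$ fixed, maximizing the ELBO's reconstruction term is exactly minimizing MSE (up to the $1/2\sigma^2$ scale); a Bernoulli decoder gives binary cross-entropy instead.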
2
votes
1 answer

What is the -0.5 in the VAE loss function's KL term?

The VAE loss is composed of two terms: a reconstruction loss and a KLD loss. In the implementation, a factor of -0.5 is applied to the KLD loss. Kindly let me know what this -0.5 is.
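For reference, the -0.5 is not a tuning weight: it falls out of the closed-form KL divergence between the diagonal-Gaussian posterior and the standard-normal prior. A PyTorch-style sketch (tensor shapes are illustrative):

```python
import torch

def kld_standard_normal(mu, logvar):
    # Closed-form KL( N(mu, diag(exp(logvar))) || N(0, I) ):
    #   KL = -0.5 * sum(1 + logvar - mu^2 - exp(logvar))
    # summed over latent dims, averaged over the batch. The -0.5 is part
    # of the Gaussian algebra, not a hyperparameter.
    return (-0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(dim=1)).mean()

mu, logvar = torch.zeros(8, 4), torch.zeros(8, 4)
print(kld_standard_normal(mu, logvar))  # tensor(0.): q already equals the prior
```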
2
votes
0 answers

Proportion of Variance for Variational Autoencoders

For example, in PCA the proportion of variance explained by a component is proportional to its eigenvalue. Now for VAEs, is there a way to estimate the amount of variance that is explained by a single latent factor of variation? Or…
besterma
  • 21
  • 3
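There is no canonical analogue of PCA's eigenvalue ratio for a nonlinear decoder, but one heuristic in the spirit of the question is a first-order Sobol-style index: decoder-output variance when only latent dim j varies, relative to when all dims vary. A NumPy sketch with a stand-in linear decoder (W and decode are hypothetical placeholders for a trained model):

```python
import numpy as np

rng = np.random.default_rng(0)
LATENT_DIM, OUT_DIM = 8, 10
W = rng.standard_normal((LATENT_DIM, OUT_DIM))  # stand-in for a trained decoder

def decode(z):
    # Hypothetical decoder; replace with your VAE decoder's forward pass.
    return z @ W

def variance_share(j, n=5000):
    # First-order Sobol-style index for latent dim j under a N(0, I) prior:
    # output variance when only dim j varies / variance when all dims vary.
    z_one = np.zeros((n, LATENT_DIM))
    z_one[:, j] = rng.standard_normal(n)
    z_all = rng.standard_normal((n, LATENT_DIM))
    return decode(z_one).var(axis=0).sum() / decode(z_all).var(axis=0).sum()

print([round(variance_share(j), 3) for j in range(LATENT_DIM)])
```

For a linear decoder this reduces to PCA-like shares $\lVert W_j \rVert^2 / \sum_k \lVert W_k \rVert^2$; for a real VAE decoder, interactions between dims mean the shares need not sum to 1.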
2
votes
1 answer

Variational inference with discrete variational parameters

Typically, variational inference relies on taking gradient steps on the KL divergence between the variational and true posterior, or on the ELBO. This does not seem valid when the variational parameters are discrete (since gradients wrt those arguments are…
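One fallback worth noting alongside this question: the ELBO is still well-defined pointwise in a discrete variational parameter, so coordinate ascent by enumeration over a finite set sidesteps gradients entirely. A toy NumPy sketch — the model and grid are purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
x_obs = 2.0

def log_joint(z):
    # Toy model: p(z) = N(0, 1), p(x | z) = N(z, 1); constants dropped.
    return -0.5 * z**2 - 0.5 * (x_obs - z)**2

def elbo(mu, n=4000):
    # q = N(mu, 1); Monte Carlo ELBO estimate (entropy of q is constant).
    z = mu + rng.standard_normal(n)
    return log_joint(z).mean()

# Discrete variational parameter: mu restricted to a finite grid.
candidates = np.linspace(-3, 3, 13)
best = max(candidates, key=elbo)
print(best)  # ~1.0, the exact posterior mean for this toy model
```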
2
votes
1 answer

Variational inference with deterministic dependencies between variables

Suppose I have a probabilistic graphical model shown in the picture, in which all variables are binary, $c_1$ and $c_2$ are observed, and I want to use mean-field variational inference to estimate beliefs about the remaining variables. Suppose…
Ruben van Bergen
  • 6,511
  • 1
  • 20
  • 38
1
vote
0 answers

How to choose the number of latent dimensions in VAE?

I have trained a VAE that can generate photos of human faces. I have isolated the dimension that correlates most with smiling, and now I only want the VAE to generate smiling faces. May I know whether it is usual practice to adjust the latent_dim value in…
Johnny Tam
  • 173
  • 3
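For cross-reference: the usual practice is not to change latent_dim (which would require retraining) but to do latent-space arithmetic on the trained model, estimating a "smile direction" from encoded examples and shifting prior samples along it. A sketch where encode and decode are hypothetical stand-ins for the trained VAE:

```python
import numpy as np

rng = np.random.default_rng(0)

def smile_direction(encode, smiling_imgs, neutral_imgs):
    # Attribute vector: difference of the mean latent codes of the two groups.
    return encode(smiling_imgs).mean(axis=0) - encode(neutral_imgs).mean(axis=0)

def sample_smiling(decode, latent_dim, direction, n=16, strength=1.5):
    # Draw from the prior, then shift every sample along the smile direction.
    z = rng.standard_normal((n, latent_dim))
    return decode(z + strength * direction)
```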
1
vote
1 answer

Clarification of Equation for Variational Inference in Pattern Recognition and Machine Learning

I am looking at the derivation of variational inference and specifically the approach taken by Bishop in his book on page 465 as illustrated in the Figure below. The key step is the statement below Equation 10.8 in which he says "... Thus maximizing…
AJR
  • 51
  • 2
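For readers without the book at hand: since $\ln p(X)$ does not depend on $q$, maximizing the lower bound $\mathcal{L}(q)$ is equivalent to minimizing the KL term, which is the statement below Eq. 10.8; holding all other factors fixed, the mean-field optimum for a single factor is Bishop's Eq. 10.9:

```latex
\ln p(X) \;=\; \mathcal{L}(q) + \mathrm{KL}\bigl(q \,\Vert\, p(Z \mid X)\bigr),
\qquad
\mathcal{L}(q) \;=\; \int q(Z)\,\ln \frac{p(X, Z)}{q(Z)}\,\mathrm{d}Z,
\qquad
\ln q_j^{\ast}(Z_j) \;=\; \mathbb{E}_{i \neq j}\bigl[\ln p(X, Z)\bigr] + \mathrm{const}.
```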
1
vote
1 answer

How does the reparameterisation trick work for multivariate Gaussians?

I understand that for sampling from a univariate Gaussian, we can use $x = g(\epsilon) = \mu + \epsilon \sigma $ and then differentiate this transformation with respect to $\mu, \sigma$. How does this work for sampling a multivariate Gaussian with…
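The short answer, with a sketch: factor $\Sigma = LL^\top$ (e.g. via Cholesky) and set $x = \mu + L\epsilon$ with $\epsilon \sim \mathcal{N}(0, I)$; gradients then flow through $\mu$ and $L$ exactly as through $\mu$ and $\sigma$ in the univariate case, and a diagonal $\Sigma$ recovers the elementwise $\mu + \sigma \odot \epsilon$ used in most VAEs. A NumPy illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
mu = np.array([0.5, -1.0])
Sigma = np.array([[1.0, 0.6],
                  [0.6, 2.0]])

L = np.linalg.cholesky(Sigma)          # Sigma = L @ L.T
eps = rng.standard_normal((10000, 2))  # parameter-free noise
x = mu + eps @ L.T                     # x ~ N(mu, Sigma); differentiable in mu, L

print(np.cov(x, rowvar=False))         # ~ Sigma
```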
1
vote
1 answer

Generating variations on a class with VQ-VAE with PixelCNN prior

I'm trying to wrap my head around generating from a VQ-VAE with PixelCNN prior. Mostly, I'm curious how to go about generating variations of a given "class", or object. My (foggy) understanding, at the moment, is that the model quantizes the latent…
jbm
  • 121
  • 4
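The two-stage generation loop, sketched with hypothetical stand-ins (pixelcnn_logits, codebook, and decode are placeholders for trained components, not a real API): sample a grid of discrete code indices autoregressively from the class-conditional PixelCNN, then decode the looked-up embeddings. Variations within a class come from the sampling randomness under the same class condition.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def sample_codes(pixelcnn_logits, class_id, h, w, n_codes, rng):
    # Autoregressive raster-scan sampling of discrete code indices,
    # conditioned on the desired class.
    codes = np.zeros((h, w), dtype=np.int64)
    for i in range(h):
        for j in range(w):
            probs = softmax(pixelcnn_logits(codes, i, j, class_id))
            codes[i, j] = rng.choice(n_codes, p=probs)
    return codes

def generate(codes, codebook, decode):
    # Look up the (h, w) grid of indices in the codebook, then decode.
    return decode(codebook[codes])
```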
1
vote
1 answer

Mean Square Error as reconstruction loss in VAE

I'm trying to understand one specific formula in a paper that I'm reading: https://arxiv.org/pdf/1911.02469.pdf It concerns Equation 10: unfortunately, the authors don't explain what σ is. I also tried to find other sources. In…
Sandro
  • 15
  • 6
1
vote
1 answer

Two-step maximum likelihood inference

Suppose we have a latent r.v. $Z$ (not observed) and an observed r.v. $X$, where $X$ depends on $Z$ via some conditional distribution $p(x|z)$. Given $x$, we will try to infer $z$. Standard maximum likelihood inference asks: given $x$, find $z^*$…
D.W.
  • 5,892
  • 2
  • 39
  • 60