Questions tagged [variational]
23 questions
4 votes, 0 answers
Rao-Blackwellization in Black Box VI
In the paper, "Black Box Variational Inference," by Ranganath et al. (2013), the authors derive a Rao-Blackwellized estimator of the gradient of the evidence lower bound with respect to a single variational parameter $\lambda_i$. Notation:
$q(z |…

Ethan S
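Not from the paper itself, but for orientation: the estimator being Rao-Blackwellized is the score-function (REINFORCE) gradient of the ELBO. Below is a minimal numpy sketch of the vanilla estimator for a mean-field Gaussian $q$; the Rao-Blackwellized version would replace `log_joint(z)` with only the factors involving the coordinate governed by $\lambda_i$ (its Markov blanket), reducing variance without introducing bias. All names and signatures here are illustrative.

```python
import numpy as np

def elbo_grad_score_function(log_joint, mu, log_sigma, n_samples=100):
    """Score-function (REINFORCE) estimator of the ELBO gradient for a
    mean-field Gaussian q(z) = N(mu, diag(exp(log_sigma))^2).
    Rao-Blackwellization would swap log_joint(z) for the per-coordinate
    Markov-blanket terms to cut the estimator's variance."""
    sigma = np.exp(log_sigma)
    grads_mu, grads_ls = [], []
    for _ in range(n_samples):
        z = mu + sigma * np.random.randn(*mu.shape)  # sample z ~ q
        log_q = -0.5 * np.sum(((z - mu) / sigma) ** 2
                              + 2.0 * log_sigma + np.log(2.0 * np.pi))
        score_mu = (z - mu) / sigma**2               # d log q / d mu
        score_ls = ((z - mu) / sigma) ** 2 - 1.0     # d log q / d log_sigma
        weight = log_joint(z) - log_q                # log p(x,z) - log q(z)
        grads_mu.append(score_mu * weight)
        grads_ls.append(score_ls * weight)
    return np.mean(grads_mu, axis=0), np.mean(grads_ls, axis=0)
```

With `log_joint` set to, e.g., a standard-normal log density, the two returned arrays estimate the ELBO gradients with respect to $\mu$ and $\log\sigma$.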
4 votes, 0 answers
What's the intuition behind variational learning in deep NNs with an attention mechanism?
I'm trying to understand the paper "Multiple Object Recognition With Visual Attention" (Ba et al., 2015), specifically Section 3, which explains how the model is trained.
Please let me jump right to the question, without…

Andrew B
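Not an answer, but for orientation: the training objective in that section is a variational (Jensen) lower bound on the marginal likelihood over glimpse locations. In generic notation (mine, not necessarily the paper's):

$$\log p(y \mid x) \;=\; \log \sum_{l} p(l \mid x)\, p(y \mid l, x) \;\ge\; \sum_{l} p(l \mid x)\, \log p(y \mid l, x),$$

by Jensen's inequality; the right-hand side is estimated with Monte Carlo samples of locations $l \sim p(l \mid x)$, which is what yields the REINFORCE-style gradients for the attention network.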
3 votes, 0 answers
Using calculus of variations to deduce lower bound on Pitman efficiency (asymptotic relative efficiency) between $t$-test and sign test
Let $Y_1,...,Y_n$ be iid draws from a location family $\{f(\cdot - \theta) : \theta \in \mathbb{R}\}$. $f$ is a symmetric density w.r.t. the Lebesgue measure on $\mathbb{R}$ with finite variance. We want to test $H_0 : \theta = 0$ versus $H_1 :…

martingale_50
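For reference, the classical result I believe this is aiming at: in this setting the Pitman ARE of the sign test relative to the $t$-test is $4\sigma^2 f(0)^2$, and the calculus-of-variations step minimizes it over admissible densities:

$$\operatorname{ARE}(\mathrm{sign}, t) \;=\; 4\,\sigma^2 f(0)^2, \qquad \inf_{\substack{f \text{ symmetric, unimodal} \\ \operatorname{Var}(f) = \sigma^2}} 4\,\sigma^2 f(0)^2 \;=\; \tfrac{1}{3},$$

with the infimum attained by the uniform density on $[-\sqrt{3}\,\sigma, \sqrt{3}\,\sigma]$, for which $f(0) = 1/(2\sqrt{3}\,\sigma)$ and hence $4\sigma^2 f(0)^2 = 1/3$. Without a unimodality constraint the infimum is $0$, since mass can be pushed away from the origin.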
2 votes, 0 answers
What exactly is the point of computing a lower bound for the log partition function in variational methods in probabilistic graphical models?
Variational methods are applied when we are interested in a probability distribution $P$ but only have a tractably computable unnormalized form $\tilde{P}$ of $P$. Knowing the partition function $Z = \sum_x \tilde{P}(x)$ is desirable because it…

user118967
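One standard way to see the point, from the usual decomposition: for any tractable $Q$,

$$\log Z \;=\; \underbrace{\mathbb{E}_{Q}\!\left[\log \frac{\tilde{P}(X)}{Q(X)}\right]}_{\text{the lower bound (ELBO)}} \;+\; \operatorname{KL}\!\left(Q \,\|\, P\right),$$

so maximizing the lower bound over a tractable family does two things at once: it tightens the estimate of $\log Z$, and (since the left side is fixed) it minimizes $\operatorname{KL}(Q \,\|\, P)$, leaving $Q$ as a tractable surrogate for $P$.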
2 votes, 3 answers
In a VAE, why use MSE loss between the input x and the decoded sample x' from the latent distribution?
Variational Autoencoders (VAEs) are based on the concept of Variational Inference (VI) and, like vanilla Autoencoders (AEs), use two neural networks for function approximation. I understand the derivation of the Evidence Lower Bound (ELBO) and…

Jonas G.
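The usual justification, sketched: MSE is (up to an affine transformation) the negative log-likelihood of a Gaussian decoder with fixed isotropic variance. For $p_\theta(x \mid z) = \mathcal{N}(x;\, x', \sigma^2 I)$ in $D$ dimensions,

$$-\log p_\theta(x \mid z) \;=\; \frac{\lVert x - x' \rVert^2}{2\sigma^2} \;+\; \frac{D}{2}\log\!\left(2\pi\sigma^2\right),$$

so minimizing MSE maximizes this likelihood term of the ELBO; a Bernoulli decoder would analogously give binary cross-entropy.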
2 votes, 1 answer
What is the -0.5 in the VAE loss function with the KL term?
The VAE loss is composed of two terms:
Reconstruction loss
KLD loss
In the implementation, a factor of -0.5 is applied to the KLD loss. Kindly let me know what this -0.5 is.

amir
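The factor is part of the closed-form KL divergence between the diagonal-Gaussian approximate posterior and the standard-normal prior, not an arbitrary weight. A minimal sketch (variable names are illustrative):

```python
import numpy as np

def kld_diag_gaussian(mu, logvar):
    """KL( N(mu, diag(exp(logvar))) || N(0, I) ), summed over dimensions.
    The leading -0.5 seen in VAE implementations is this factor: the
    closed form is 0.5 * sum(mu^2 + var - 1 - logvar), usually written
    as -0.5 * sum(1 + logvar - mu^2 - var)."""
    return -0.5 * np.sum(1.0 + logvar - mu**2 - np.exp(logvar))
```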
2 votes, 0 answers
Proportion of Variance for Variational Autoencoders
For example, in PCA the proportion of variance explained by a component is proportional to its eigenvalue.
Now for VAEs, is there a way to estimate the amount of variance that is explained by a single latent factor of variation?
Or…

besterma
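For the PCA side of the comparison (the fact the question starts from), the quantity is just the normalized eigenvalue spectrum of the covariance; a minimal numpy sketch:

```python
import numpy as np

def pca_explained_variance_ratio(X):
    """Proportion of variance explained by each principal component:
    covariance eigenvalues, sorted in descending order and normalized."""
    cov = np.cov(X, rowvar=False)            # sample covariance of the data
    eigvals = np.linalg.eigvalsh(cov)[::-1]  # eigenvalues, largest first
    return eigvals / eigvals.sum()
```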
2 votes, 1 answer
Variational inference with discrete variational parameters
Typically, variational inference relies on taking gradient steps on the KL divergence between the variational and true posterior, or on the ELBO. This does not seem valid when the variational parameters are discrete (since gradients w.r.t. those arguments are…

Dionysis M
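One standard workaround, sketched under the assumption that each discrete parameter ranges over a small finite set: replace gradient steps with coordinate ascent, optimizing one parameter at a time by enumeration. The function names are illustrative:

```python
def coordinate_ascent_discrete(elbo, params, domains, n_sweeps=10):
    """Maximize elbo(params) over discrete parameters by coordinate ascent.
    params: list of current values; domains: finite candidate set per
    coordinate. Each sweep sets params[i] to the value that maximizes the
    ELBO with the other coordinates held fixed; no gradients are needed."""
    params = list(params)
    for _ in range(n_sweeps):
        for i, domain in enumerate(domains):
            params[i] = max(domain,
                            key=lambda v: elbo(params[:i] + [v] + params[i+1:]))
    return params
```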
2 votes, 1 answer
Variational inference with deterministic dependencies between variables
Suppose I have a probabilistic graphical model shown in the picture, in which all variables are binary, $c_1$ and $c_2$ are observed, and I want to use mean-field variational inference to estimate beliefs about the remaining variables. Suppose…

Ruben van Bergen
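For context, the update that runs into trouble here is the standard mean-field one, in which the optimal factor for a variable $x_j$ is

$$\log q_j^{*}(x_j) \;=\; \mathbb{E}_{q_{-j}}\!\left[\log p(x_j, x_{-j}, c_1, c_2)\right] \;+\; \text{const},$$

and when some conditionals are exactly $0$ or $1$ (deterministic dependencies), the expectation picks up $\log 0 = -\infty$ terms; a common practical fix (an assumption on my part, not from the question) is to soften the deterministic CPDs by a small $\epsilon$.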
1 vote, 0 answers
How to choose the number of latent dimensions in VAE?
I have trained a VAE that can generate photos of human faces.
I have isolated the dimension that correlates most with smiling, and now I only want the VAE to generate smiling faces.
May I know whether it is usual practice to adjust the latent_dim value in…

Johnny Tam
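An alternative to retraining with a different latent_dim, sketched with hypothetical names (`decoder`, `smile_dim`, and `smile_value` are placeholders, not from the question): keep the trained VAE and clamp the isolated dimension at sampling time.

```python
import numpy as np

def sample_smiling_faces(decoder, latent_dim, smile_dim, smile_value=2.0, n=16):
    """Sample from the VAE prior but clamp the 'smiling' coordinate.
    decoder(z) -> images is assumed to be the trained VAE decoder."""
    z = np.random.randn(n, latent_dim)  # z ~ N(0, I), the VAE prior
    z[:, smile_dim] = smile_value       # push the smile dimension high
    return decoder(z)                   # decode to images
```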
1 vote, 1 answer
Clarification of Equation for Variational Inference in Pattern Recognition and Machine Learning
I am looking at the derivation of variational inference, specifically the approach taken by Bishop in his book on page 465, as illustrated in the figure below. The key step is the statement below Equation 10.8, in which he says "... Thus maximizing…

AJR
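For reference, the result that step leads to (Bishop's Eq. 10.9): writing $q(Z) = \prod_i q_i(Z_i)$ and maximizing $\mathcal{L}(q)$ with respect to a single factor $q_j$, with the others held fixed, gives

$$\log q_j^{*}(Z_j) \;=\; \mathbb{E}_{i \neq j}\!\left[\log p(X, Z)\right] \;+\; \text{const},$$

i.e., the bound in 10.8 equals a negative KL divergence between $q_j$ and this exponentiated expectation, plus terms independent of $q_j$, so maximizing the bound means driving that KL to zero; that is what the quoted sentence asserts.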
1 vote, 1 answer
How does the reparameterisation trick work for multivariate Gaussians?
I understand that for sampling from a univariate Gaussian, we can use $x = g(\epsilon) = \mu + \epsilon \sigma $ and then differentiate this transformation with respect to $\mu, \sigma$. How does this work for sampling a multivariate Gaussian with…

Ben Gutteridge
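The multivariate analogue replaces $\sigma$ with a Cholesky factor: with $\Sigma = L L^{\top}$, set $x = g(\epsilon) = \mu + L\epsilon$ for $\epsilon \sim \mathcal{N}(0, I)$, and differentiate with respect to $\mu$ and $L$. A minimal numpy sketch:

```python
import numpy as np

def reparameterize_mvn(mu, Sigma, n_samples=1):
    """Reparameterization trick for x ~ N(mu, Sigma): x = mu + L @ eps,
    where Sigma = L L^T (Cholesky) and eps ~ N(0, I). The randomness
    lives in eps, so gradients can flow through mu and L."""
    L = np.linalg.cholesky(Sigma)                  # lower-triangular factor
    eps = np.random.randn(n_samples, mu.shape[0])  # parameter-free noise
    return mu + eps @ L.T                          # each row is one sample
```

In practice one often parameterizes $L$ directly (e.g., unconstrained entries with a positive diagonal) rather than differentiating through the Cholesky decomposition.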
1 vote, 1 answer
Generating variations on a class with VQ-VAE with PixelCNN prior
I'm trying to wrap my head around generating from a VQ-VAE with PixelCNN prior. Mostly, I'm curious how to go about generating variations of a given "class", or object. My (foggy) understanding, at the moment, is that the model quantizes the latent…

jbm
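A heavily hypothetical sketch of the usual recipe (every name here is a placeholder; the question's actual models may differ): condition the PixelCNN prior on the class, sample several discrete code grids ancestrally, and decode each grid with the VQ-VAE decoder; different samples of the code grid give the "variations" of the class.

```python
import numpy as np

def sample_class_variations(pixelcnn_logits, decode, class_id, grid_hw=(8, 8), n=4):
    """Hypothetical: pixelcnn_logits(codes, class_id, i, j) -> logits over
    codebook entries for grid cell (i, j); decode(codes) -> image.
    Codes are sampled autoregressively in raster order."""
    h, w = grid_hw
    images = []
    for _ in range(n):
        codes = np.zeros((h, w), dtype=int)
        for i in range(h):
            for j in range(w):
                logits = pixelcnn_logits(codes, class_id, i, j)
                p = np.exp(logits - logits.max())     # softmax, stably
                p /= p.sum()
                codes[i, j] = np.random.choice(len(p), p=p)
        images.append(decode(codes))                  # VQ-VAE decoder
    return images
```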
1 vote, 1 answer
Mean Square Error as reconstruction loss in VAE
I'm trying to understand one specific formula in a paper that I'm reading:
https://arxiv.org/pdf/1911.02469.pdf
It concerns Equation 10:
Unfortunately, the authors don't explain what $\sigma$ is. I also tried to find other sources. In…

Sandro
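Without seeing Equation 10, the generic role of $\sigma$ in such formulas (my reading, not the paper's): it is the decoder's observation-noise scale, and it sets the weight of the reconstruction term in the ELBO,

$$\mathcal{L} \;=\; -\frac{1}{2\sigma^2}\,\mathbb{E}_{q}\!\left[\lVert x - \hat{x} \rVert^2\right] \;-\; \frac{D}{2}\log\!\left(2\pi\sigma^2\right) \;-\; \operatorname{KL}\!\left(q(z \mid x) \,\|\, p(z)\right),$$

so a small $\sigma$ emphasizes reconstruction over the KL term, much as $\beta$ does in a $\beta$-VAE.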
1 vote, 1 answer
Two-step maximum likelihood inference
Suppose we have a latent r.v. $Z$ (not observed) and an observed r.v. $X$, where $X$ depends on $Z$ via some conditional distribution $p(x|z)$. Given $x$, we will try to infer $z$.
Standard maximum likelihood inference asks: given $x$, find $z^*$…

D.W.
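For reference, the baseline being contrasted, in generic notation (the question's own definition is truncated above): maximum-likelihood inference of the latent picks

$$z^{*}_{\mathrm{ML}} \;=\; \arg\max_{z}\; p(x \mid z), \qquad z^{*}_{\mathrm{MAP}} \;=\; \arg\max_{z}\; p(x \mid z)\, p(z),$$

where the MAP variant additionally weights candidates by the prior $p(z)$.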