1

I've got a quick question with regards to accept-reject Monte Carlo integration that I can't solve. Suppose I want to integrate some function, $f(x,y)$, with samples of $x, y$ from $p(x,y)$.

Now, with Monte Carlo integration the integral can be approximated by,

$I = \int f(x,y) \ dx \ dy = \int \frac{f(x,y)}{p(x,y)} p(x,y) = \mathbb{E}[f(x,y)]_{x,y \sim p(x,y)} \approx \frac{1}{N}\sum_{i}^{N} \frac{f(x_i, y_i)}{p(x_i, y_i)}$

In order to sample i.i.d samples from p(x,y) I use the accept-reject method. This methods includes sampling uniformly across the domain of the integrand, calculates the integrand value at $f(x,y)$ then samples uniformly $u \sim [0, 1)$. If $u < f(x,y)$, accept the values of $(x,y)$ and repeat the process until $N$ samples have been calculated.

The probability distribution function, pdf, I use is f(x,y) because it holds the minimum variance. The problem I have is when I calculate the integral via the sum of $\frac{1}{N}\sum_{i}^{N} \frac{f(x_i, y_i)}{p(x_i, y_i)}$ I get 1. It's clear because $f$ and $p$ are the same, so $f(x,y)/p(x,y)$ will always equal 1. The problem is, how can I normalise the estimator such that the sum above gives the correct result (i.e. the integral value of $\pi$) and not the normalised result?

Many thanks in advance!

1 Answers1

1

There are several levels of confusion:

  1. the approximation \begin{align}I &= \int f(x,y) \,\text{d}x \,\text{d}y \\&= \int \frac{f(x,y)}{p(x,y)} p(x,y) \,\text{d}x \,\text{d}y \\&= \mathbb{E}_{(X,Y)\sim p}[f(X,Y)] \\&\approx \frac{1}{N}\sum_{i}^{N} \frac{f(x_i, y_i)}{p(x_i, y_i)} \end{align} is called importance sampling. It is one form of Monte Carlo integration.

  2. sampling i.i.d samples from $p(x,y)$ may be feasible by the accept-reject method but it should not imply $f(\cdot,\cdot)$ at all (in general). If, for instance, $p(x,y)\le M$ over the domain/support $\mathfrak D$ of the integrand, then sampling $(X,Y)$ uniformly over this domain $\mathfrak D$ and accepting this realisation if $$u\le A p(x,y)/M\qquad u∼\mathcal U(0,1)$$ where $A$ is the volume of the domain is a correct version of the algorithm. An alternative to the Uniform may be more efficient.

  3. the optimal importance distribution function, $p$, is indeed proportional to $f$, namely $$p(x,y)=\frac{f(x,y)}{I}$$ assuming $f$ is non-negative. In that case, $$\frac{1}{N}\sum_{i}^{N} \frac{f(x_i, y_i)}{p(x_i, y_i)} = I$$ even for $N=1$ and the variance is zero. This optimality result is of course formal, i.e., it cannot be used in practice since it depends on the unknown integral value $I$.

  4. If a sample from $p_f\propto f$ can be produced (e.g. by accept-reject techniques), there exist unbiased estimators of $1/I$. The generic identity $$\int \frac{\alpha(z)}{f(x)}\,\frac{f(z)}{I}\,\text{d}z= \int \frac{\alpha(z)}{I}\,\text{d}z=I^{-1}$$shows that for any probability density $\alpha(\cdot)$ with support within the domain $\mathfrak D$, the harmonic estimator$$\frac{1}{N}\sum_{n=1}^N \frac{\alpha(z_n)}{f(z_n)}\qquad z_1,\ldots,z_N\sim p_f(x)$$is converging to $I^{-1}$. The instrumental density $\alpha(\cdot)$ must however be chosen such that the variance of the weight $\frac{\alpha(Z_n)}{f(Z_n)}$ is finite.

Xi'an
  • 90,397
  • 9
  • 157
  • 575
  • Thanks for the quick response! So, because my pdf is effectively dependent on $f$ I can't use it? I could take a ratio of the points under the integral against the total number of points and do a crude MC calculation but I can't calculate a variance on that method. So, I assume there's no way to calculate $I$ using this unless I pick a pdf which isn't dependent on $f$? – AlphaBetaGamma96 May 11 '20 at 19:07
  • 1
    No this is not what I mean: $p$ should depend on $f$ to make the variance small, but the choice $p \propto f$ is not possible. – Xi'an May 11 '20 at 19:31
  • Ah, ok. It makes sense now. Thank you for the help, @Xi'an! – AlphaBetaGamma96 May 12 '20 at 13:09
  • 1
    This is but one of the many paradoxes related with importance sampling!!! – Xi'an May 12 '20 at 13:14