
Let's say I have an unnormalized probability density function $f(x)$, which is related to the normalized pdf $\xi$ via $\xi = \frac{f}{c}$, where $c$ is the normalizing constant.

I also have a sample $S = \{x_i\}_{i=1}^n \sim \xi$, i.e. drawn from the normalized pdf $\xi$.

Can $S$ then be used to determine the normalizing constant $c$?

That is, in a simple case taken from Wikipedia:

$$p(x) = e^{-x^2/2}$$

so

$$\int_{-\infty}^\infty p(x)dx = \int_{-\infty}^{\infty}e^{-x^2/2}dx = \sqrt{2\pi} = c$$

and if the function $\phi(x)$ is defined as

$$\phi(x) = \frac{1}{\sqrt{2\pi}}p(x) = \frac{1}{\sqrt{2\pi}}e^{-x^2/2}$$

so that

$$\int_{-\infty}^{\infty}\phi (x)dx = \int_{-\infty}^{\infty}\frac{1}{\sqrt{2\pi}}e^{-x^2/2}dx = 1$$

then $\frac{1}{\sqrt{2\pi}}$ is the normalizing constant of $p(x)$.
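As a quick sanity check of the Wikipedia example, the integral can also be evaluated numerically. The sketch below uses base R's `integrate`; the names `p`, `c_num`, `phi` and `total` are only illustrative.

```r
# Unnormalized Gaussian kernel from the example above
p <- function(x) exp(-x^2 / 2)

# Numerical quadrature over the whole real line
c_num <- integrate(p, lower = -Inf, upper = Inf)$value
c_num          # should be numerically close to sqrt(2 * pi)

# Dividing by c_num gives a function that integrates to (about) one
phi <- function(x) p(x) / c_num
total <- integrate(phi, lower = -Inf, upper = Inf)$value
total
```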

So in the case of the sampling set $S$, would I determine the approximate normalized pdf $\xi$ via a histogram (and potentially curve fitting) and compare it to $f(x)$?
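To make the histogram idea concrete, here is a rough sketch of what I have in mind. It assumes, purely for illustration, that $f(x) = e^{-x^2/2}$ and that $S$ comes from a standard normal; it replaces the histogram with a kernel density estimate (`density`) and uses the fact that $c = f(x)/\xi(x)$ at every $x$. The grid and the use of the median are arbitrary choices on my part.

```r
set.seed(1)

f <- function(x) exp(-x^2 / 2)   # known unnormalized density (illustrative choice)
S <- rnorm(1e5)                  # stand-in for the given sample from xi

kde  <- density(S)               # kernel density estimate of xi
grid <- seq(-1, 1, by = 0.1)     # points where xi should be well estimated

# Interpolate the density estimate at the grid points
xi_hat <- approx(kde$x, kde$y, xout = grid)$y

# c = f(x) / xi(x) for every x, so summarise the pointwise ratios
c_kde <- median(f(grid) / xi_hat)
c_kde                            # compare with sqrt(2 * pi)
```

The result depends on the bandwidth and on where the ratio is evaluated, so this is only an approximation.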

Is that the same as $c = \int_{-\infty}^{\infty} p(x)dx$ from Wikipedia?

Goal: find $c$ using $S$. Is this possible?

  • Is $f$ the unnormalized version of $\xi$? What exactly do you want to achieve? Please be more specific. – Haotian Chen Apr 30 '20 at 21:31
  • @HaotianChen Correct, I know $f$ up to a normalization constant $c$. I also have a sample set $S$ from $\xi$, which is related to $f$ by $\xi = \frac{f}{c}$. I want to know if I can find $c$ via a procedure similar to the one outlined in the article. –  Apr 30 '20 at 21:39

2 Answers


This is a fairly common problem for Bayesian statistics where the posterior distribution $$p(\theta|x)=\dfrac{f(x|\theta)\pi(\theta)}{\int_\Theta f(x|\theta)\pi(\theta)\,\text{d}\theta}=\dfrac{f(x|\theta)\pi(\theta)}{m(x)}$$ most often involves an intractable normalising constant $m(x)$.

One of my answers to earlier questions on that topic lists a range of solutions, based on simulation. A book-length entry is found in Chen, Shao and Ibrahim (2001).

Reverting to the question and its notations, if the only available material is provided by the sample $$\mathfrak S = \{x_i\}_{i=1}^n \sim \xi$$ with no further access to simulation, contrary to the previous answer, a range of solutions can be found by a reverse version of the importance sampling method. Namely, for any density function $\alpha(\cdot)$ [whose support is contained in the support of $f(\cdot)$], the following general identity holds: $$\mathbb{E}_\xi\left[\frac{\alpha(X)}{f(X)}\right]=\int_{\mathfrak X} \dfrac{\alpha(x)}{f(x)}\,\xi(x)\,\text{d}x=\int_{\mathfrak X} \dfrac{\alpha(x)}{f(x)}\dfrac{f(x)}{c}\,\text{d}x=\int_{\mathfrak X} \dfrac{\alpha(x)}{c}\,\text{d}x=\frac{1}{c}$$ where the last equality uses the fact that the density $\alpha(\cdot)$ integrates to one. Therefore, the estimate $$\frac{1}{n}\sum_{i=1}^n \dfrac{\alpha(x_i)}{f(x_i)}\qquad x_i\sim\xi(x)$$ is an unbiased and convergent estimator of $1/c$, whatever $\alpha(\cdot)$ is. The only caution in choosing this $\alpha(\cdot)$ density is to ensure that the estimator has a finite variance, for otherwise the outcome is completely untrustworthy.
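As an illustration only, here is a minimal R sketch of this estimator on the Gaussian example from the question. The choices below are assumptions made for the example: $f(x) = e^{-x^2/2}$, a sample simulated with `rnorm` standing in for $\mathfrak S$, and $\alpha$ taken as a normal density with standard deviation 0.5, so that $\alpha/f$ is bounded and the variance is finite.

```r
set.seed(1)

f <- function(x) exp(-x^2 / 2)   # unnormalized density, true c = sqrt(2 * pi)
S <- rnorm(1e5)                  # stand-in for the available sample from xi

# Any proper density with lighter tails than f keeps alpha / f bounded
alpha <- function(x) dnorm(x, mean = 0, sd = 0.5)

# Sample average of alpha(x_i) / f(x_i) estimates 1 / c
inv_c_hat <- mean(alpha(S) / f(S))

# Inverting gives a consistent (though no longer unbiased) estimate of c
c_hat <- 1 / inv_c_hat
c_hat                            # compare with sqrt(2 * pi)
```

Taking $\alpha$ with heavier tails than $f$ (for instance a standard deviation above 1 here) makes the ratio $\alpha/f$ unbounded, which is the situation the finite-variance warning refers to.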

Xi'an
  • Neat! Thanks for the insightful response! One question: how is the identity $\int_{\mathscr{X}}\frac{\alpha (x)}{f(x)}\xi (x) \text{d}x$ derived? Doesn't importance sampling state $\hat{I}_{IS} = \sum_{i=1}^n \frac{\pi(x_i)}{nq(x_i)}g(x_i)$ where $g(x_i)$ is any function, $\pi(x_i)$ is the pdf, and $q(x_i)$ is the importance density? In your notation, is $\alpha (x) = g(x)$, $f(x) = q(x)$, and $\pi(x) = \xi(x)$? –  May 01 '20 at 15:03
  • I don't believe my previous comment is correct. I understand the identity, since the integral of $\alpha(x)$ over the domain of a proper pdf must equal $1$, but I don't fully understand how the estimate follows. Wouldn't that be an estimator of $c$, not $1/c$, i.e. you're looking at $I = \int_{\mathscr{X}}\alpha(x) \text{d}x$? –  May 01 '20 at 15:54

$c$ is the integral of $f(x)$: if you divide $f(x)$ by its integral, the resulting function is normalized as a PDF. You can estimate $c$ by sampling uniformly over the support of $f(x)$. If $U$ denotes the elements of this uniform sample, then $c \approx \text{mean}(f(U)) \times$ (length of the support).

In the article they compute the integral analytically, so they don't need Monte Carlo (MC). Here I am showing a way to compute the integral by MC.

Here is an example. The support here is from 0 to 1. f is not normalized, so we obtain c.

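# f is an unnormalized density on [0, 1]: 2.75 times the Beta(2, 5) pdf
f <- function(x) { 2.75 * dbeta(x, 2, 5) }
# Uniform draws over the support [0, 1]
U <- runif(10^7)
# Mean height of f times the length of the support (here 1)
c <- mean(f(U)) * 1
c
[1] 2.749145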
javierazcoiti
  • What do you mean by support? In the context of my question I have $S = \{x_i\}_{i=1}^n \sim \xi$, which may not be uniform. I don't necessarily know the exact form of $f(x)$ or $\xi$, only that $\xi = \frac{f}{c}$, and I want to know if I can use $S$ to get $c$. –  Apr 30 '20 at 22:21
  • Thanks for the edit! Second question: what if the support wasn't from 0 to 1? Say something generic like 2 to 3? How does $c = \mathbb{E}[f(u)]*\text{support}$ change? –  Apr 30 '20 at 22:31
  • You're welcome. In that case the example is with a Beta density and I showed how to normalize f. The length of the support is 1 (from 0 to 1). For a Gaussian, with infinite support, you can still get a "useful" approximation because most of the mass of the PDF is near zero. You could sample uniformly in the interval [-5,5] and then multiply the mean of f(u) by 10 (the length of the interval); this is of course an approximation. – javierazcoiti Apr 30 '20 at 22:41
  • By taking the mean of f(u) we obtain the mean height of the density function, and by multiplying it by the length of the support we obtain the area of a rectangle that has the same area as the region under f; this area is the integral. – javierazcoiti Apr 30 '20 at 22:52
  • Last question: does the sampling have to be uniform? Obviously different sampling techniques give different results, but is $\mathbb{E}[f(\text{samples})]*\text{support}$ still valid? Or does that only work because the uniform random variable has a constant density, i.e. to make the rectangle and then compute the integral? –  Apr 30 '20 at 22:53
  • You will not obtain the same result. You can try the R code and replace the vector U with samples from another distribution, and you will obtain a different c. c must be 2.75 because the function dbeta is normalized and I created $f(x)$ so that its integral equals 2.75 (first line of code). – javierazcoiti Apr 30 '20 at 23:07
  • As said in the other answer, you can use samples other than uniform ones, but, as I said, not in the formula I provided for computing the integral: with other sampling schemes you will not be computing the average height of $f(x)$. – javierazcoiti May 01 '20 at 14:44
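For a support other than [0, 1], the uniform-sampling recipe discussed in these comments can be sketched as follows. This is only an illustration under assumed choices: the unnormalized Gaussian $e^{-x^2/2}$ truncated to the interval $[-5, 5]$, with hypothetical names `a`, `b` and `c_unif`.

```r
set.seed(1)

f <- function(x) exp(-x^2 / 2)   # unnormalized Gaussian, true c = sqrt(2 * pi)

# Interval standing in for the (infinite) support; the tails beyond it are ignored
a <- -5
b <- 5

# Uniform draws over [a, b]
U <- runif(1e6, min = a, max = b)

# Mean height of f times the length of the interval
c_unif <- (b - a) * mean(f(U))
c_unif                           # compare with sqrt(2 * pi)
```

The factor `(b - a)` plays the role of the `* 1` in the Beta example; for an interval such as [2, 3] it would again equal 1, while the draws would come from `runif(n, 2, 3)`.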