Let's imagine I have a machine that gives me an independent random number from a normal distribution $N(\mu,1)$ whenever I push a button. I have a stopping rule to decide how many samples to collect: I will collect samples until the sum of the observations exceeds 1.

Mathematically, my stopping rule is to collect $N$ samples where $N$ is a stopping time defined as $N = \inf\{n \geq 1 : \sum_{i=1}^n X_i > 1\}$ where $X_1, X_2, \dots \sim N(\mu, 1)$. Let $D_N := \{X_1, \dots, X_N\}$ be the observed data set.
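A minimal sketch of this stopping rule in Python may help make $D_N$ concrete. The function name `sample_until_sum_exceeds`, the `seed` argument, and the `max_n` safety cap are assumptions added for illustration; the cap is there because with a negative $\mu$ the sum may never exceed 1.

```python
import random

def sample_until_sum_exceeds(mu, threshold=1.0, max_n=100_000, seed=None):
    """Draw X_i ~ N(mu, 1) until the running sum exceeds `threshold`.

    Returns the list of observations D_N, or None if the (hypothetical)
    safety cap `max_n` is reached before the sum exceeds the threshold.
    """
    rng = random.Random(seed)
    samples = []
    total = 0.0
    while len(samples) < max_n:
        x = rng.gauss(mu, 1.0)   # one push of the button
        samples.append(x)
        total += x
        if total > threshold:    # stopping rule: sum of observations > 1
            return samples
    return None                  # never stopped within max_n draws

data = sample_until_sum_exceeds(mu=0.5, seed=0)
print(len(data), sum(data))
```

Here `len(data)` plays the role of $N$ and `data` the role of $D_N$.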

My question is: if I put a standard normal prior on $\mu$, that is, $\mu \sim N(0,1)$, what is the posterior mean of $\mu$ given $D_N$?

If I simply collected a fixed number $n$ of samples, the answer would be $$ \mathbb{E}[\mu |D_n] = \frac{n}{n+1}\bar{X}_n, $$ where $\bar{X}_n$ is the sample mean based on $n$ observations.
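For reference, the fixed-$n$ formula can be written as a short Python function. This is only a sketch of the formula above for the $N(0,1)$ prior and unit observation variance; the name `posterior_mean` is an assumption for illustration.

```python
def posterior_mean(samples):
    """Posterior mean of mu under a N(0, 1) prior and N(mu, 1) observations,
    for a fixed number n of samples: n / (n + 1) * sample mean."""
    n = len(samples)
    xbar = sum(samples) / n
    return n / (n + 1) * xbar

# Three observations with sample mean 1.0 shrink toward the prior mean 0.
print(posterior_mean([0.5, 1.5, 1.0]))  # 0.75
```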

Intuitively, I think the posterior mean for a randomly stopped $D_N$ should have the same form, $$ \mathbb{E}[\mu |D_N] = \frac{N}{N+1}\bar{X}_N. $$ However, I cannot rigorously justify this, since it is unclear how to define the likelihood function of $\mu$ given the randomly stopped data set $D_N$.

Thanks in advance for your helpful answers!

Taylor
JaeHyeok Shin

1 Answer

The system you are describing is a Gaussian random walk with unknown drift $\mu$. Try checking out some related questions like this one, although your problem is significantly harder.

I think the key is to figure out $P(N=n|\mu)$ (the probability of stopping after exactly $n$ samples) and $P(D_N|N=n,\mu)$ (the probability of observing your sampled values $D_N$ given that you stopped at $n$). Then you can compute the posterior of $\mu$ as:

$$ P(\mu | D_N) = \frac{P(D_N | N=n, \mu) \times P(N=n | \mu) \times P(\mu)}{P(D_N)} $$
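As a hedged worked step, assuming the observations are recorded in order so that the event $N = n$ is determined by the observed values, the two factors named above multiply to the joint density of the data and the stopping event. The symbols $\phi$ (standard normal density) and $s_k$ (partial sums) are notation introduced here for illustration:

$$ P(D_N \mid N=n, \mu)\, P(N=n \mid \mu) = \left[\prod_{i=1}^{n} \phi(x_i - \mu)\right] \mathbf{1}\{s_1 \le 1, \dots, s_{n-1} \le 1,\ s_n > 1\}, \qquad s_k = \sum_{i=1}^{k} x_i. $$

The indicator does not involve $\mu$, so under the $N(0,1)$ prior it cancels in the normalization:

$$ P(\mu \mid D_N) \propto e^{-\mu^2/2} \prod_{i=1}^{n} \phi(x_i - \mu) \propto \exp\left(-\frac{n+1}{2}\left(\mu - \frac{n}{n+1}\bar{x}_n\right)^2\right). $$

Under these assumptions the posterior would be $N\left(\frac{n}{n+1}\bar{x}_n, \frac{1}{n+1}\right)$ evaluated at the realized $n = N$, which agrees with the form conjectured in the question.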

timchap
    Thanks! Yes, the trickiest part is that $P(D_N | N = n, \mu)$ has no simple form in general, since, conditional on $N = n$, the observations are dependent on each other. – JaeHyeok Shin Jul 11 '19 at 21:32