
In the usual textual description of an SMC algorithm for state-space models (as in the book SMC in Practice), we first propagate the particles according to the transition distribution, or according to a proposal/importance distribution, then compute the normalised weights and resample according to those weights. This description puts the resampling at the end of an iteration.
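
To make that textbook ordering concrete, here is a minimal sketch of it for a bootstrap filter, as shown below (this is just my reading of it; `sample_init`, `sample_transition` and `log_likelihood` are hypothetical model-specific functions):

```python
import numpy as np

def sisr_resample_at_end(y, N, sample_init, sample_transition, log_likelihood, rng):
    """Textbook ordering: propagate -> weight -> resample at the *end* of each step.

    Hypothetical helpers:
      sample_init(N, rng)        -> (N,) particles at time 0
      sample_transition(x, rng)  -> (N,) particles propagated one step
      log_likelihood(y_t, x)     -> (N,) values of log g(y_t | x)
    """
    x = sample_init(N, rng)
    for y_t in y:
        x = sample_transition(x, rng)         # S: extend every particle
        logw = log_likelihood(y_t, x)         # weight with the current observation
        w = np.exp(logw - logw.max())
        w /= w.sum()                          # normalised weights
        idx = rng.choice(N, size=N, p=w)      # R: multinomial resampling
        x = x[idx]                            # particles are equally weighted again
    return x
```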

However, in the article Particle Markov Chain Monte Carlo, the authors first initialise the weights for the particles at time one, then choose the ancestor indexes, which is equivalent to a resampling, and only then extend the particles. This description puts the resampling at the beginning of the iteration.
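
And this is roughly the ordering I understand from the article, with the ancestor indexes chosen before the particles are extended (same hypothetical helpers and import as above):

```python
def smc_resample_at_beginning(y, N, sample_init, sample_transition, log_likelihood, rng):
    """PMCMC-style ordering: weight -> choose ancestors -> propagate, each step."""
    x = sample_init(N, rng)
    logw = np.zeros(N)                        # initial weights are uniform
    for y_t in y:
        w = np.exp(logw - logw.max())
        w /= w.sum()
        anc = rng.choice(N, size=N, p=w)      # R: choose ancestor indexes (= resampling;
                                              #    a distributional no-op at the first step)
        x = sample_transition(x[anc], rng)    # S: extend the resampled particles
        logw = log_likelihood(y_t, x)         # reweight with the new observation
    return x
```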

Why would putting the resampling at the end or at the beginning be equivalent?

The weights we compute will be different, which results in different probabilities when we resample. Does this have no bearing on how the algorithm works?

An old man in the sea.
  • Are you asking why sometimes the description puts the resampling at the end of an iteration and sometimes it puts it at the beginning? Also, why are the weights different because of this? – Taylor Aug 25 '18 at 01:20
  • @Taylor yes, and why, intuitively, they're the same or equivalent... I think that in reality the first few iterations will differ, but with burn-in, and because asymptotically they converge to the same distribution, there's an equivalence. Sorry for my English in the question. ;) – An old man in the sea. Aug 25 '18 at 07:26
  • @Taylor I've integrated your comment into my question. Thanks ;) – An old man in the sea. Aug 25 '18 at 07:31

1 Answer


Computationally, it's easy to see that they're equivalent. If your data are $y_1, y_2, y_3$ and your states are $x_1, x_2, x_3$, and if you put the resampling at the beginning of each step (except the first step, because you can't), then your algorithm basically looks like S(RS)(RS), where S stands for sample and R stands for resample.

If you follow the convention where resampling comes at the end of every step, it goes something like (SR)(SR)(S-). You don't resample at the very end, because that would just add noise (resampling is only beneficial for future observations).
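
You can even see the re-arrangement mechanically: both conventions emit the same sequence of operations (a throwaway sketch in Python):

```python
T = 3  # number of time steps

# Resampling at the beginning of each step (skipped at the very first step):
begin = [op for t in range(T) for op in (["R"] if t > 0 else []) + ["S"]]

# Resampling at the end of each step (skipped at the very last step):
end = [op for t in range(T) for op in ["S"] + (["R"] if t < T - 1 else [])]

print("".join(begin), "".join(end))  # SRSRS SRSRS -- identical
```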

So they're the same if you just re-arrange the parentheses. This is just a guess, but the first description is probably preferred because it's easier to write down: you don't need an extra if condition for the last iteration. I usually follow the second convention myself.

What matters is which observations you use to adjust the weights. For example, in this book, in chapter 9.4, the authors separately describe the sequential importance sampling with resampling (SISR) algorithm and something they call the "IID sampling" algorithm. They are compared on the basis of when resampling is performed: the first samples and then resamples, while the second resamples first. What's important, however, is that the second algorithm reweights the time-$t$ particles using the observation at time $t+1$ before sampling the time-$(t+1)$ particles. SISR, on the other hand, samples from time $t$ to $t+1$, and then resamples afterwards based on weights that use the observation at time $t+1$. It turns out SISR has greater asymptotic variances than IID sampling does.
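
As a sketch of that lookahead idea (in the spirit of the auxiliary particle filter; `predictive_log_likelihood` is a hypothetical model-specific approximation of $\log p(y_{t+1} \mid x_t)$, e.g. $\log g(y_{t+1} \mid \cdot)$ evaluated at a point prediction of $x_{t+1}$, and the other helpers are as hypothetical as in the question's sketches):

```python
import numpy as np

def lookahead_step(x_t, y_next, N, predictive_log_likelihood,
                   sample_transition, log_likelihood, rng):
    """One step that resamples using the *next* observation before propagating."""
    logw1 = predictive_log_likelihood(y_next, x_t)   # score particles against y_{t+1}
    w = np.exp(logw1 - logw1.max())
    w /= w.sum()
    anc = rng.choice(N, size=N, p=w)                 # resample *before* mutating
    x_next = sample_transition(x_t[anc], rng)        # then extend to time t+1
    # second-stage weights correct for the approximate predictive likelihood
    logw_next = log_likelihood(y_next, x_next) - logw1[anc]
    return x_next, logw_next
```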

This is the motivation behind the choice of proposal distribution in this class of algorithms: taking future observations into account is generally a good idea, whether you use them to resample before mutating, or to help choose a proposal distribution (e.g. using the locally optimal proposal, or approximating it with the auxiliary particle filter).
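
For reference, the locally optimal proposal mentioned above is the one that conditions on the next observation, writing $f$ for the transition density and $g$ for the observation density:

$$
q^{\text{opt}}(x_t \mid x_{t-1}, y_t) = p(x_t \mid x_{t-1}, y_t) \propto f(x_t \mid x_{t-1})\, g(y_t \mid x_t),
$$

and under it the incremental weight is $p(y_t \mid x_{t-1}) = \int f(x_t \mid x_{t-1})\, g(y_t \mid x_t)\, dx_t$, which doesn't depend on the sampled $x_t$; the auxiliary particle filter can be viewed as approximating this when the integral is intractable.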

Taylor