
I'm coding some Bayesian bandit algorithms for exponential families, and in the case where my rewards are normally distributed I need to use an improper uniform prior. Is there any way to represent this in R? I guess I could use `runif(1, 0, n)` with some arbitrarily large $n$, but that still only draws from $\mathrm{Unif}[0, n]$, so it isn't ideal.

Here's my current code. Once an arm (think of a slot machine) has been pulled at least once, I can sample from the normal posterior. I'm just trying to see how to formulate a pull from the prior when it's not a probability distribution.

    select_arm = function() {
      sampled_theta <- NULL

      # First we test whether the arm has been pulled. If it hasn't, we draw
      # from the (approximated) improper uniform prior; otherwise, we draw
      # from the normal posterior.
      for (i in 1:self$num_arms) {
        if (private$trials[i] == 0) {
          sampled_theta <- c(sampled_theta, runif(1, min = -10^4, max = 10^4))
        } else {
          # draw a sample from the normal posterior
          a <- private$scores[i] / private$trials[i]   # posterior mean
          b <- self$variance[i] / private$trials[i]    # posterior variance
          sampled_theta <- c(sampled_theta, rnorm(1, a, sqrt(b)))
        }
      }

      # play the arm with the largest sampled mean
      which.max(sampled_theta)
    }
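
For reference, the `else` branch above is meant to draw from the flat-prior posterior: assuming `private$scores[i]` is the running sum of rewards for arm $i$ and `self$variance[i]` is the known $\sigma^2$, the standard conjugate calculation with $f(\mu) \propto 1$ gives

$$\mu \mid x_1, \ldots, x_n \sim N\!\left(\bar{x},\ \frac{\sigma^2}{n}\right),$$

so `a` and `b` above are just $\bar{x}$ and $\sigma^2/n$. The posterior is proper as soon as an arm has been pulled once; my problem is only the $n = 0$ case.
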
Glassjawed
  • `runif(n)` gives you $n$ independent $\mathrm{U}[0,1]$. This is not what you want. You probably want something like `runif(n, min = -10^4, max = 10^4)`. – Zen Dec 01 '14 at 19:10
  • I'm doing Thompson sampling so it's similar. – Glassjawed Dec 01 '14 at 19:41
  • You cannot sample from an improper prior because it is not a probability distribution! Using a proper approximation is likely to produce something different. This has nothing to do with R, obviously, but with the very nature of improper priors. When this happens in ABC algorithms, I use an alternative proper distribution that is an approximation to the posterior and then construct importance sampling weights to correct for this substitute. Could you detail in the question what your targeted distribution is? – Xi'an Dec 01 '14 at 20:29
  • I figured as much. I'm trying to simulate Thompson sampling for 1D normal bandits using the Jeffreys prior. However, the Jeffreys prior $f(\mu)$ for $N(\mu,\sigma^2)$ (fixed variance) is improper (uniform with infinite support). How did you use importance sampling weights? – Glassjawed Dec 01 '14 at 21:19
  • This article http://en.wikipedia.org/wiki/Thompson_sampling says that you must sample from the posterior. Aren't you confusing the two? – Zen Dec 02 '14 at 01:42
  • Nope. I am sampling from the posterior, but I need some prior assumptions to guide the first pull of each arm. For example, there is a case where the rewards for each arm $i$ are distributed $\mathrm{Bernoulli}(p_i)$. However, we first need starting assumptions on $p$, so often we just assume $p_i \sim \mathrm{Beta}(1,1)$ for every $i$. Then, after the $i$th arm is pulled $n$ times with results $x_1,\ldots,x_n$, we draw $p_i$ from the posterior $p_i \mid x_1,\ldots,x_n \sim \mathrm{Beta}(s_n+1,\, n-s_n+1)$, where $s_n$ is the total reward in $n$ trials (a minimal sketch of this scheme is below). – Glassjawed Dec 02 '14 at 02:16
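
To make the Bernoulli example in the last comment concrete, here is a minimal self-contained sketch of that Beta-Bernoulli scheme (the success probabilities `p_true`, the horizon of 1,000 rounds, and all variable names are made up for illustration):

    # Minimal sketch of Beta-Bernoulli Thompson sampling (illustrative values)
    set.seed(1)
    p_true    <- c(0.3, 0.5, 0.7)   # unknown Bernoulli parameters, one per arm
    n_arms    <- length(p_true)
    successes <- rep(0, n_arms)     # s_n for each arm
    trials    <- rep(0, n_arms)     # n for each arm

    for (t in 1:1000) {
      # one draw per arm from the Beta(s_n + 1, n - s_n + 1) posterior
      sampled_p <- rbeta(n_arms, successes + 1, trials - successes + 1)
      arm       <- which.max(sampled_p)        # Thompson choice
      reward    <- rbinom(1, 1, p_true[arm])   # observe a Bernoulli reward
      successes[arm] <- successes[arm] + reward
      trials[arm]    <- trials[arm] + 1
    }

    round(trials / sum(trials), 2)  # most pulls should concentrate on the best arm

The key difference from my normal case is that $\mathrm{Beta}(1,1)$ is a proper prior, so `rbeta` can be called even for an arm with zero pulls; with a flat prior on $\mu$ there is nothing analogous to draw from until the arm has been pulled at least once.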

0 Answers