I get the gist of Thompson sampling for price optimisation (I think - see this video around minute 31). I wonder, would Thompson sampling require discriminative pricing, or can prices be changed sequentially during the day? Thanks.

cs0815
  • What do you mean by discriminative pricing? – Tim Nov 07 '21 at 20:40
  • It means that n customers could encounter m different prices at a/the same point in time. This is illegal in many countries, including the EU. – cs0815 Nov 07 '21 at 20:43

1 Answer

If by discriminative pricing you mean

that n customers could encounter m different prices at a/the same point in time

then it's not a problem. You would sample a price for a batch of customers and, given the outcomes, make the Bayesian update. For example, in the case of a Bernoulli bandit with a $\mathsf{Beta}(\alpha, \beta)$ prior, if a single customer made no purchase, the posterior would be $\mathsf{Beta}(\alpha, \beta+1)$. If you observed $m$ purchases and $k$ cases without a purchase, the posterior would just be $\mathsf{Beta}(\alpha + m, \beta+k)$. The same applies to continuous rewards: you just need an update that accounts for multiple samples.
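To make the batch update concrete, here is a minimal sketch of how it could look in Python. Everything in it is an assumption for illustration: the function names (`update_beta`, `choose_price`, `run_batches`), the uniform $\mathsf{Beta}(1, 1)$ prior, the simulated conversion rates, and the choice to rank prices by sampled expected revenue (price times sampled purchase probability). It is not a definitive implementation.

```python
import random


def update_beta(alpha, beta, purchases, no_purchases):
    """Conjugate update: Beta(alpha, beta) -> Beta(alpha + m, beta + k)."""
    return alpha + purchases, beta + no_purchases


def choose_price(posteriors, rng):
    """Thompson step: draw one conversion rate per price from its posterior
    and pick the price with the highest sampled expected revenue."""
    sampled_revenue = {
        price: price * rng.betavariate(a, b)
        for price, (a, b) in posteriors.items()
    }
    return max(sampled_revenue, key=sampled_revenue.get)


def run_batches(true_rates, n_batches, batch_size, seed=0):
    """Simulate batched Thompson sampling.

    true_rates maps each candidate price to a made-up purchase probability
    that is used only to simulate customers (hypothetical numbers).
    """
    rng = random.Random(seed)
    # Assumed uniform Beta(1, 1) prior for every candidate price.
    posteriors = {price: (1.0, 1.0) for price in true_rates}
    for _ in range(n_batches):
        # One price for the whole batch, so every customer in it sees the same price.
        price = choose_price(posteriors, rng)
        purchases = sum(
            rng.random() < true_rates[price] for _ in range(batch_size)
        )
        a, b = posteriors[price]
        # Multi-sample update with m purchases and k non-purchases.
        posteriors[price] = update_beta(a, b, purchases, batch_size - purchases)
    return posteriors


if __name__ == "__main__":
    final = run_batches({5.0: 0.30, 10.0: 0.20}, n_batches=20, batch_size=50)
    for price, (a, b) in sorted(final.items()):
        print(f"price={price}: Beta({a}, {b})")
```

Because a single price is drawn per batch, all customers in that batch see the same price, and the price only changes between batches.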

Tim
  • Thanks. So one price per batch, let us say n hours? Then change the price for the next n hours? – cs0815 Nov 07 '21 at 21:43
  • @cs0815 How you conduct the experiment depends on many technical, business, and legal considerations. What I'm saying is that multi-sample updates are technically possible; there is no reason why they would not be. – Tim Nov 07 '21 at 21:50