I get the gist of Thompson sampling for price optimisation (I think; see this video around minute 31). Would Thompson sampling require discriminative pricing, or can prices be changed sequentially during the day? Thanks.
-
What do you mean by discriminative pricing? – Tim Nov 07 '21 at 20:40
-
It means that n customers could encounter m different prices at the same point in time. This is illegal in many countries, including in the EU. – cs0815 Nov 07 '21 at 20:43
1 Answer
If by discriminative pricing you mean
that n customers could encounter m different prices at a/the same point in time
then it's not a problem. You would sample the price for a batch of customers and, given the outcomes, make the Bayesian update. For example, in the case of a Bernoulli bandit with a $\mathsf{Beta}(\alpha, \beta)$ posterior, if a single customer makes no purchase, the update would be $\mathsf{Beta}(\alpha, \beta+1)$; if you observed $m$ purchases and $k$ cases without purchase, the update would just be $\mathsf{Beta}(\alpha + m, \beta+k)$. The same applies to continuous rewards: you just need to make an update that considers multiple samples.
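
To make the batch update concrete, here is a minimal sketch in Python. It is an illustration under assumptions, not a reference implementation: the function names (`batch_update`, `choose_price`, `run`), the uniform $\mathsf{Beta}(1, 1)$ prior, the simulated `true_rates`, and the choice to rank prices by sampled expected revenue (price times sampled purchase probability) are all my own choices for the example.

```python
import random


def batch_update(alpha, beta, purchases, no_purchases):
    """Conjugate Beta update after a batch of Bernoulli outcomes."""
    return alpha + purchases, beta + no_purchases


def choose_price(prices, posteriors, rng):
    """Thompson step: draw one purchase probability per price from its
    Beta posterior and pick the price with the highest sampled revenue.
    (Ranking by price * probability is an assumption of this sketch.)"""
    best_price, best_value = None, float("-inf")
    for price in prices:
        a, b = posteriors[price]
        theta = rng.betavariate(a, b)  # sampled purchase probability
        value = price * theta          # sampled expected revenue
        if value > best_value:
            best_price, best_value = price, value
    return best_price


def run(prices, true_rates, n_batches, batch_size, seed=0):
    """Show ONE price to every customer in a batch, then update once.
    `true_rates` is a hypothetical simulator of customer behaviour."""
    rng = random.Random(seed)
    posteriors = {p: (1.0, 1.0) for p in prices}  # assumed Beta(1, 1) prior
    for _ in range(n_batches):
        price = choose_price(prices, posteriors, rng)
        purchases = sum(
            rng.random() < true_rates[price] for _ in range(batch_size)
        )
        a, b = posteriors[price]
        posteriors[price] = batch_update(
            a, b, purchases, batch_size - purchases
        )
    return posteriors
```

Calling, say, `run([5.0, 10.0], {5.0: 0.5, 10.0: 0.2}, n_batches=20, batch_size=50)` returns the posterior parameters per price. Note that applying `batch_update` once with $m$ purchases and $k$ non-purchases gives the same posterior as applying it $m + k$ times, one customer at a time, so everyone in a batch can see the same price.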

Tim
-
Thanks. So one price per batch, let us say for n hours? Then change the price for the next n hours? – cs0815 Nov 07 '21 at 21:43
-
@cs0815 How you conduct the experiment would depend on many technical, business, or legal considerations. What I'm saying is that multi-sample updates are technically possible; there is no reason why they would not be. – Tim Nov 07 '21 at 21:50