I'm trying to implement a Bayesian Learning/Updating Model (multi-armed bandit) in the following way:
- I'm conducting a survey where respondents can rate items on a 5-point scale.
- I have a total of 5 items; each respondent gets assigned to exactly one item.
- The idea is that items which received higher ratings from previous respondents are assigned with higher probability to subsequent respondents.
To determine which item should be shown next, I'm using Bayesian Learning.
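To make the selection mechanism concrete, here is a rough sketch of what I picture for the assignment step (a Thompson-sampling-style draw, which is just my guess at how to realize "higher rating → higher probability"; the per-item means and uncertainties are made-up placeholders, and how to get the real posterior for the draw is exactly what I'm asking about below):

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholder per-item summaries: running mean rating and a rough
# uncertainty for that mean (these numbers are invented for illustration).
items = {
    "item_1": {"mean": 3.8, "sd_of_mean": 0.4},
    "item_2": {"mean": 3.2, "sd_of_mean": 0.5},
    "item_3": {"mean": 4.1, "sd_of_mean": 0.3},
    "item_4": {"mean": 2.9, "sd_of_mean": 0.6},
    "item_5": {"mean": 3.5, "sd_of_mean": 0.4},
}

def pick_item_for_next_respondent():
    # Draw one plausible mean rating per item from its (placeholder)
    # posterior and show the item with the largest draw, so better-rated
    # items are chosen more often but not always.
    draws = {name: rng.normal(s["mean"], s["sd_of_mean"]) for name, s in items.items()}
    return max(draws, key=draws.get)

print(pick_item_for_next_respondent())
```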
I might be completely wrong here. I'm assuming that the ratings for each item are normally distributed (I know... it's a 5-point scale), and I know neither the mean nor the variance of each item in the population. My impression is that I therefore have to use an inverse-Gamma prior (more precisely, a Normal-inverse-Gamma prior, since the mean is unknown as well) to get the posterior distribution for each item, updated for every new respondent based on the ratings of the previous respondents.
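To spell out the model I have in mind for a single item (this is the standard Normal-inverse-Gamma parameterization as I understand it; $x_i$ is the rating of the $i$-th respondent assigned to that item, and $\mu_0, \kappa_0, \alpha_0, \beta_0$ are the hyperparameters I'm asking about):

$$
x_i \mid \mu, \sigma^2 \sim \mathcal{N}(\mu, \sigma^2), \qquad
\sigma^2 \sim \text{Inv-Gamma}(\alpha_0, \beta_0), \qquad
\mu \mid \sigma^2 \sim \mathcal{N}\!\left(\mu_0, \tfrac{\sigma^2}{\kappa_0}\right)
$$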
I already consulted this post: Bayesian updating with new data, but it covers the case where the population mean and variance are known (which is not the case in my example).
So as far as I understand, what I would do is:
- Collect an initial batch of ratings per item (let's say 10 respondents for each of the 5 items).
- Use the sample mean and variance of these initial ratings as the prior for subsequent respondents.
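As a minimal sketch of that starting point (the ratings below are made-up numbers; turning these per-item summaries into the hyperparameters is exactly what I don't know how to do):

```python
import numpy as np

# Initial phase: 10 ratings (1-5) per item; all numbers here are made up.
initial_ratings = {
    "item_1": [4, 5, 3, 4, 4, 5, 3, 4, 5, 4],
    "item_2": [3, 2, 4, 3, 3, 2, 4, 3, 3, 4],
    "item_3": [5, 4, 5, 4, 5, 3, 4, 5, 4, 5],
    "item_4": [2, 3, 2, 3, 3, 2, 4, 2, 3, 3],
    "item_5": [3, 4, 3, 4, 3, 4, 3, 4, 4, 3],
}

# Per-item summaries that I intend to use as the "prior" for later respondents.
priors = {}
for item, ratings in initial_ratings.items():
    r = np.asarray(ratings, dtype=float)
    priors[item] = {
        "mean": r.mean(),       # sample mean of the initial ratings
        "var": r.var(ddof=1),   # sample variance of the initial ratings
        "n": len(r),            # number of initial ratings
    }

print(priors["item_1"])
```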
But from there on I'm a bit lost. So my questions are:
- How do I set up the starting values for the inverse-Gamma hyperparameters (mean, variance (or precision), alpha, beta)?
- How do I update these hyperparameters after each new respondent's rating?