1

Lets say I have movie ratings from different users for multiple films. I can find the beta distribution that best fits all the ratings. I can also find the beta distribution that best fits the ratings for a particular movie. What I would like is to be able to say: "If I pull a rating of a movie out of a hat that is a 3, but I don't know what the movie is or who the user is, what is the most likely distribution of their ratings of movie X?".

Meaning that because people who have rated one movie lower are likely to rate movies lower generally, I should generally get a distribution of probable ratings for movie X with a lower mean.

I can take a movie and get distributions for that movie given a user has rated a movie a 1, a 2 a 3 etc. But this would mean that close but not exact ratings would not support neighbouring distributions and continuous ratings would not work at all.

What would be best is a surface plot probability mass function where the y axis is the given rating of an unknown film, where if I took a slice at that y, I would get a Beta distribution where x is ratings and z is probability.

Does such a distribution exist and how would I best compute the most likely parameters for the distribution?

  • 1
    I didn't fully grasp your question (pardon), but the multivariate Beta distribution is usually taken to be the Dirichlet distribution. This holds is the outcomes must sum to 1. Many other possibilities also (probably) exist given other details of your question. – Firebug Dec 07 '17 at 19:03
  • So you want a multivariate distribution where the marginal distributions are Beta? – Gabriel J. Odom Dec 07 '17 at 19:14
  • I'm not that familiar with statistics any more and especially not the Dirichlet. Maybe Dirichlet is the the correct distribution? To answer the second question, yes I think that is what I want. – Joe Cordingley Dec 07 '17 at 19:35
  • Dirichlet it is then! – Xi'an Dec 07 '17 at 21:15
  • I looked at Dirichlet briefly but I was put off by the fact that the surface plot was imposed on a triangle. I was thinking my surface would be a square. With the sample rating of the unknown movie on one side and the expected rating of the target movie on the other. I will look into it properly though. – Joe Cordingley Dec 08 '17 at 07:54
  • @JoeCordingley The triangle representation works when you have three compositional elements which sum to one. – Firebug Jan 29 '18 at 15:59
  • For some illustrative demonstrations, see [this blog post](http://bascornelissen.nl/2017/01/01/bags.html). – Firebug Jan 29 '18 at 16:02
  • Such distribution, for instance $(X,Y)$ random variable where each is beta, marginally. But there is no canonical choice as in the case of the normal distribution, there are many possibilities. One possibility is to use copulas. – kjetil b halvorsen May 15 '19 at 21:56
  • Re "does such a distribution exist:" see https://stats.stackexchange.com/questions/87358. – whuber Oct 21 '21 at 21:45

0 Answers0