13

I'm working on something like the following problem. I have a bunch of users and N books. Each user creates an ordered ranking of all the books he's read (which is likely a subset of the N books), e.g., Book 1 > Book 40 > Book 25.

Now I want to turn these individual user rankings into a single ordered ranking of all the books.

Are there any good or standard approaches to try? So far, I'm thinking of Bradley-Terry models applied to pairwise comparisons, but I'm wondering if there's anything else.

kjetil b halvorsen
  • 63,378
  • 26
  • 142
  • 467
raegtin
  • 9,090
  • 12
  • 48
  • 53
  • 2
    I would think you would run into a lot of problems with sparsity, unless the users (for some reason) read similar books. But given n people, and given that most people read relatively few books, the vast majority of pairs will have only one person making the comparison. (The median number was 6 per person per year see [Pew](http://libraries.pewinternet.org/2012/12/27/e-book-reading-jumps-print-book-reading-declines/) – Peter Flom Sep 28 '13 at 01:05
  • 2
    (+1) raegtin, you ask nice, interesting questions. I am fond of BT models, but it seems a tad forced here. Are you familiar with the collaborative filtering literature? It's not the same problem, but some concepts and techniques could be borrowed. A question left unaddressed here is why one should believe the books can be given an unambiguous, well-defined ranking in the first place. (How would you handle the two-user, two-book case, for example?) – cardinal Sep 28 '13 at 03:19
  • @Peter Flom: Correct, most pairs have no comparisons. But I'm hoping that's fine, since if you know that A > B and B > C, then even if A and C aren't directly compared, you can infer A > C. – raegtin Sep 29 '13 at 03:59
  • @cardinal: Yep, BT models seem forced here, but it's the only thing I can think of right now. I'm familiar with the collaborative filtering literature, but I'm not sure how it applies here, since I want rankings, not similarities. It's true that a single global ranking doesn't necessarily make sense (e.g., does it make sense to compare children's books vs. adult books? fiction vs. non-fiction?), but practically, it's still useful. "Best of" book lists pop up all the time :) – raegtin Sep 29 '13 at 04:02
  • Also, I don't care so much about close orderings (e.g., whether book ranked #1 is truly better than book #2), but rather orderings in aggregate (e.g., I do want the top 10% of books in my ordering to be better than the bottom 10% or the middle 10%). – raegtin Sep 29 '13 at 04:02
  • Yes, but the sparseness may be more perverse. What if Joe ranks B > A, Jim ranks A > B and those are only two ranking that pair? Then multiply out by all the pairs. – Peter Flom Sep 29 '13 at 11:31
  • What if you use something like the ELO rating? – chuse Dec 10 '14 at 12:24
  • I believe the problem is known as rank aggregation. – Christian Lindig Jul 23 '20 at 21:29

2 Answers2

2

If you're interested in use (more than in development), you should give a try to rankade, our ranking system.

Rankade is free and easy to use, and it's different from Bradley-Terry model and Elo ranking system (here's a comparison) because it can manage matches with 2+ factions (i.e. books, in your scenario). Inserting user's ordered rankings (as matches between two or more books, with detailed final standings, including ties) you'll obtain the single ordered ranking of all the books you're looking for. In addiction, rankade give you the opportunity to check time evolution for books ranking, and stats for books match-ups, and more.

Tomaso Neri
  • 766
  • 5
  • 18
  • 3
    You ought to describe your algorithm, at least generally, as approach. And link to a paper where it is described in full. Otherwise your answer might be considered as simply an ad. – ttnphns Feb 18 '16 at 14:04
  • 1
    I added a link for a simple comparison between *ree* and most known ranking system. First statement says _If you're interested in use (more than in development)_, so it's proposed as a _solution_ for the problem (rankade features a GUI, while Bradley-Terry and Plackett-Luce need implementation to be used), more than a path to reach the requested solution. – Tomaso Neri Feb 18 '16 at 23:26
1

Plackett-Luce ranking models deal with this problem and are a likelihood based technique where the likelihood is maximized using a majorization-maximization routine, which is similar to Expectation Maximization, in the sense that they use an auxiliary objective function over the likelihood function which is optimized to guarantee iterative monotonic maximization of the likelihood function. (see MM algorithms for Plackett-Luce ranking models by David Hunter). He provides code as well.

From a ranking perspective, they are an extension of Bradley-Terry models which you mention in your post. Bradley-Terry models estimate a global ranking from a sample of pairwise rankings. Plackett-Luce models extend this to rankings of length $>=$ 2. They also allow for each sample being a ranking of a different length.

This fits your dataset perfectly:

Book 1 > Book 40 > Book 25

Book 40 > Book 30

Book 25 > Book 17 > Book 11 > Book 3 etc.

hearse
  • 2,355
  • 1
  • 17
  • 30