
I'd like to classify short documents whose words come from a predefined vocabulary.

Which algorithm would you suggest, LDA or pLSA?

My use case

I have a list of users, and for each user a list of the pages she likes.

My goal is to classify users (documents) into classes (topics).

The documents are short, since a user is unlikely to like more than dozens of pages, and there are approximately 100k pages I care about.
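
To make the setup concrete, here is a minimal sketch of how I picture it: each user is a "document" whose "words" are the pages she likes, and pLSA is fitted with plain EM. This is only an illustration under my own assumptions, not a recommendation of pLSA over LDA: the toy data, the function names `plsa` and `normalise`, and the parameters (`n_topics`, `n_iter`, `seed`, `eps`) are all made up, and nothing here is tuned for 100k pages.

```python
import random
from collections import defaultdict

# Hypothetical toy data: user -> pages she likes (all names are made up).
likes = {
    "u1": ["cats", "dogs", "pets"],
    "u2": ["dogs", "pets"],
    "u3": ["python", "linux"],
    "u4": ["linux", "python", "vim"],
}

def normalise(weights):
    """Scale a dict of non-negative weights so the values sum to 1."""
    total = sum(weights.values())
    return {k: v / total for k, v in weights.items()}

def plsa(likes, n_topics=2, n_iter=50, seed=0, eps=1e-12):
    """Fit pLSA by EM. Returns (P(topic | user), P(page | topic))."""
    rng = random.Random(seed)
    users = sorted(likes)
    pages = sorted({p for liked in likes.values() for p in liked})
    topics = range(n_topics)

    # Random positive initialisation of both distributions.
    p_z_u = {u: normalise({z: rng.random() + 0.1 for z in topics}) for u in users}
    p_p_z = {z: normalise({p: rng.random() + 0.1 for p in pages}) for z in topics}

    for _ in range(n_iter):
        acc_z_u = {u: defaultdict(float) for u in users}
        acc_p_z = {z: defaultdict(float) for z in topics}
        for u in users:
            for p in likes[u]:  # each like is one "word" occurrence
                # E-step: posterior P(topic | user, page).
                joint = [p_z_u[u][z] * p_p_z[z][p] for z in topics]
                total = sum(joint)
                for z in topics:
                    resp = joint[z] / total
                    acc_z_u[u][z] += resp
                    acc_p_z[z][p] += resp
        # M-step: renormalise the expected counts; eps avoids exact zeros.
        p_z_u = {u: normalise({z: acc_z_u[u][z] + eps for z in topics}) for u in users}
        p_p_z = {z: normalise({p: acc_p_z[z][p] + eps for p in pages}) for z in topics}
    return p_z_u, p_p_z

p_z_u, p_p_z = plsa(likes)
# Hard assignment: the most probable topic for each user.
user_topic = {u: max(dist, key=dist.get) for u, dist in p_z_u.items()}
```

Here `user_topic` would be the class I assign to each user. EM only finds a local optimum, so the result depends on the seed. LDA differs in that it places Dirichlet priors on both distributions; I assume an existing implementation (for example `LatentDirichletAllocation` in scikit-learn) would be used for that rather than hand-written code.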

Uri Goren
  • Could you give some more detail, such as what your goal is? What will the classification be used for? Are the classes also predefined, making it a classification problem, or undefined/unknown, making it a clustering problem? Without more information this cannot be answered. – kjetil b halvorsen Sep 07 '15 at 13:58
  • added some more data on my use case – Uri Goren Sep 08 '15 at 09:01
  • Topic modeling for short documents is discussed here: http://stats.stackexchange.com/questions/25558/topic-models-for-short-documents – Piotr Migdal Mar 06 '16 at 11:58

0 Answers