I'm trying to understand Latent Dirichlet Allocation (LDA) so that I can apply it to a Twitter dataset. I have a dataset of 10k tweets, which I have already split into six groups. Now I'd like to extract topics from each group separately, but I don't fully understand the concept of a "document" in LDA. Can I use each group as a single document (so 6 documents in total), or must I split each group into a predefined number of documents (e.g. take group 1 and divide its tweets according to their hashtags)?
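To make the two options concrete, here is roughly what I have in mind. This is only a sketch: the tweets are made up, and the function names `docs_one_per_group` and `docs_by_hashtag` are placeholders I invented, not part of any library. It only builds the candidate "documents"; it does not run LDA itself.

```python
import re
from collections import defaultdict

# Hypothetical sample: (group_id, tweet_text) pairs standing in for the 10k tweets.
tweets = [
    (1, "Loving the new phone #tech #gadgets"),
    (1, "Battery life is terrible #tech"),
    (1, "No hashtag here"),
    (2, "Great match tonight #football"),
]

def docs_one_per_group(tweets):
    """Option A: concatenate all tweets of a group into one document."""
    grouped = defaultdict(list)
    for group_id, text in tweets:
        grouped[group_id].append(text)
    return {g: " ".join(texts) for g, texts in grouped.items()}

def docs_by_hashtag(tweets, group_id):
    """Option B: within one group, pool tweets that share a hashtag.

    Assumption: a tweet with several hashtags is copied into each matching
    document, and tweets without hashtags go into a "_no_hashtag" document.
    """
    pooled = defaultdict(list)
    for g, text in tweets:
        if g != group_id:
            continue
        tags = re.findall(r"#(\w+)", text.lower()) or ["_no_hashtag"]
        for tag in tags:
            pooled[tag].append(text)
    return {tag: " ".join(texts) for tag, texts in pooled.items()}

option_a = docs_one_per_group(tweets)           # one document per group
option_b = docs_by_hashtag(tweets, group_id=1)  # several documents for group 1

print(len(option_a), "documents with option A")
print(sorted(option_b), "are the documents for group 1 with option B")
```

With option A the whole corpus has only as many documents as there are groups, while with option B each group becomes its own small corpus, which is what I would then feed to LDA group by group.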
Thanks