Highest Voted 'latent-dirichlet-alloc' Questions - Statistical Analysis Stack Exchange

30

votes

4 answers

R packages for performing topic modeling / LDA: just `topicmodels` and `lda`

It seems to me that only two R packages are able to perform Latent Dirichlet Allocation: One is lda, authored by Jonathan Chang; and the other is topicmodels authored by Bettina Grün and Kurt Hornik. What are the differences between these two…

asked Mar 10 '12 at 15:47

bit-question

2,637
6
25
26

11

votes

0 answers

Is sparsity of topics a necessary condition for latent Dirichlet allocation (LDA) to work?

I have been playing with the hyper-parameters of the latent Dirichlet allocation (LDA) model and am wondering how sparsity of topic priors plays a role in inference. I have not performed these experiments on real data, but on simulated data. I…

dirichlet-distribution topic-models latent-dirichlet-alloc

asked Mar 07 '17 at 21:14

kedarps

2,902
2
19
30
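
On the sparsity question above, a small simulation can make the effect of the Dirichlet concentration parameter concrete. This is only a sketch under assumptions: a symmetric prior, 10 components, and the values 0.1 and 10.0 are all picked for illustration and do not come from the question. It draws from a symmetric Dirichlet by normalizing Gamma draws and reports how large the biggest component is on average.

```python
import random


def sample_dirichlet(alpha, k, rng):
    """Draw one sample from a symmetric Dirichlet(alpha) over k components."""
    # Independent Gamma(alpha, 1) draws, normalized to sum to 1, follow a Dirichlet.
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(k)]
    total = sum(draws)
    return [d / total for d in draws]


def mean_largest_component(alpha, k, n_samples, rng):
    """Average size of the largest component over n_samples draws."""
    largest = [max(sample_dirichlet(alpha, k, rng)) for _ in range(n_samples)]
    return sum(largest) / n_samples


rng = random.Random(0)  # fixed seed so the run is repeatable
# Illustrative values only: alpha < 1 favors sparse draws, alpha > 1 favors even ones.
sparse = mean_largest_component(alpha=0.1, k=10, n_samples=2000, rng=rng)
dense = mean_largest_component(alpha=10.0, k=10, n_samples=2000, rng=rng)
print(f"alpha=0.1  mean largest component: {sparse:.3f}")
print(f"alpha=10.0 mean largest component: {dense:.3f}")
```

With a small concentration most of the mass tends to land on one or two components, while a large concentration gives draws close to uniform. Whether sparsity is strictly necessary for inference to work is a separate matter that this simulation does not settle.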

10

votes

1 answer

How does the topic coherence score in LDA intuitively make sense?

Referring to: http://qpleple.com/topic-coherence-to-evaluate-topic-models/ In order to decide the optimum number of topics to be extracted using LDA, topic coherence score is always used to measure how well the topics are extracted: $CoherenceScore…

topic-models latent-dirichlet-alloc

asked Nov 02 '18 at 20:58

Kid_Learning_C

247
1
2
7

8

votes

1 answer

Using topic words generated by LDA to represent a document

I want to do document classification by representing each document as a set of features. I know that there are many ways: BOW, TFIDF, ... I want to use Latent Dirichlet Allocation (LDA) to extract the topic keywords of EACH SINGLE document. The…

feature-selection text-mining topic-models latent-dirichlet-alloc

asked Sep 06 '14 at 15:36

Munichong

1,645
3
15
26

7

votes

1 answer

Reasonable hyperparameter range for Latent Dirichlet Allocation?

What are good ranges for the hyperparameters $\alpha$ and $\beta$ (explained well here) in LDA? I appreciate hyperparameter tuning always depends on the use case, data, content of documents etc., but is there any general rule or heuristic to choose…

hyperparameter latent-dirichlet-alloc

asked Jun 04 '18 at 17:20

PyRsquared

1,084
2
9
20

7

votes

1 answer

How does LDA (Latent Dirichlet Allocation) assign a topic-distribution to a new document?

I am new to topic modeling and have read about LDA and NMF (Non-negative Matrix Factorization). I understand how the training process works. Let's say I have 100 documents and I want to train an LDA for these documents with 10 topics. However, I don't really…

natural-language topic-models latent-dirichlet-alloc non-negative-matrix-factorization

asked Jan 29 '18 at 13:07

nickg

71
1
3
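
For the new-document question above, one way to picture the idea is a "fold-in" step: keep the learned topic-word probabilities fixed and estimate only the new document's topic proportions. The sketch below is a simplified EM-style loop with additive smoothing, not the variational or Gibbs procedure that LDA libraries typically run. The function name `infer_topic_mixture`, the toy two-topic model, and the smoothing value 0.1 are all hypothetical.

```python
def infer_topic_mixture(doc_word_ids, topic_word_probs, alpha=0.1, n_iters=50):
    """Estimate topic proportions for one new document, with topics held fixed.

    doc_word_ids: list of vocabulary indices, one per word token.
    topic_word_probs: topic_word_probs[k][w] = P(word w | topic k), all > 0 here.
    """
    n_topics = len(topic_word_probs)
    theta = [1.0 / n_topics] * n_topics  # start from a uniform mixture
    for _ in range(n_iters):
        expected_counts = [0.0] * n_topics
        for w in doc_word_ids:
            # E-step: how responsible is each topic for this word token?
            weights = [theta[k] * topic_word_probs[k][w] for k in range(n_topics)]
            norm = sum(weights)
            for k in range(n_topics):
                expected_counts[k] += weights[k] / norm
        # M-step: smoothed, renormalized expected counts
        total = sum(expected_counts) + n_topics * alpha
        theta = [(expected_counts[k] + alpha) / total for k in range(n_topics)]
    return theta


# Toy model (assumed): 2 topics over a 4-word vocabulary.
topics = [
    [0.45, 0.45, 0.05, 0.05],  # topic 0 prefers words 0 and 1
    [0.05, 0.05, 0.45, 0.45],  # topic 1 prefers words 2 and 3
]
new_doc = [0, 1, 0, 1, 2]  # mostly words that topic 0 prefers
theta = infer_topic_mixture(new_doc, topics)
print([round(t, 3) for t in theta])
```

The training documents never enter this step. Only the topic-word table learned from them is reused, which is why a trained model can score unseen text.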

7

votes

1 answer

Clustering with Latent Dirichlet allocation (LDA): Distance Measure

Since a similarity/distance measure is crucial for every clustering algorithm, I wonder what this measure is for LDA. Since LDA works on text as a bag-of-words model, can someone imagine the similarity between topics (clusters) are the representative…

clustering distance similarities latent-dirichlet-alloc

asked Jul 19 '17 at 09:41

Lisa

73
1
1
4
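
On the distance question above: LDA is a generative model and does not optimize an explicit distance, but once documents are represented as topic proportions, they can be compared after the fact with a distance between probability distributions. The sketch below uses Hellinger distance and Jensen-Shannon divergence as two possible choices. The three document vectors are made up for illustration.

```python
import math


def hellinger(p, q):
    """Hellinger distance between two discrete distributions (0 = same, 1 = disjoint)."""
    return math.sqrt(0.5 * sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q)))


def jensen_shannon(p, q):
    """Jensen-Shannon divergence in bits (0 = same, 1 = disjoint)."""
    mid = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Terms with x_i = 0 contribute nothing; y_i > 0 whenever x_i > 0 here.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, mid) + 0.5 * kl(q, mid)


# Hypothetical topic proportions for three documents over three topics.
doc_a = [0.7, 0.2, 0.1]
doc_b = [0.6, 0.3, 0.1]
doc_c = [0.1, 0.1, 0.8]

print("a vs b:", round(hellinger(doc_a, doc_b), 3), round(jensen_shannon(doc_a, doc_b), 3))
print("a vs c:", round(hellinger(doc_a, doc_c), 3), round(jensen_shannon(doc_a, doc_c), 3))
```

Both measures are symmetric and bounded, which makes them convenient inputs for a separate clustering step on the topic proportions.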

7

votes

0 answers

How to use LDA to predict topic proportion for new document?

I'm interested to learn how I can use a trained LDA (Latent Dirichlet Allocation) model to make predictions on the topic proportion of a new, unseen document using Naive Bayes. Let $z \in \{1, 2, ..., Z\}$ denote a particular topic (there's $Z$…

bayesian sampling naive-bayes topic-models latent-dirichlet-alloc

asked May 24 '17 at 04:09

zzhengnan

171
3

6

votes

1 answer

What's the relation between Matrix Factorization (MF) and Latent Dirichlet Allocation (LDA)?

My understanding is that both MF and LDA can be applied to do document classification. I will first summarize my understanding of these two methods before I ask my questions. Assuming we use a big matrix X to summarize the documents in a corpus and…

machine-learning text-mining matrix-decomposition latent-dirichlet-alloc

asked Nov 12 '17 at 03:57

cwl

719
3
19

5

votes

1 answer

How to use LDA to classify documents into predefined topics?

LDA is unsupervised and it classifies documents into topics. But is there a way to make LDA classify the documents into predefined (or specific desired) topics? The link below says we need a custom beta prior where we provide more weights to some…

natural-language text-mining latent-dirichlet-alloc

asked Oct 17 '20 at 03:19

tjt

687
4
13

5

votes

3 answers

What are the labels for SVM classification when we first run LDA (LDA->SVM)?

I am using LDA (Latent Dirichlet Allocation) to extract topics. I want to do topic modelling and use the topics as features to do document classification. The reason for doing classification is to evaluate my LDA model. The same as this link lda ,…

svm text-mining topic-models latent-dirichlet-alloc

asked Aug 07 '17 at 06:35

sariii

228
1
12

4

votes

1 answer

Understanding Latent Dirichlet Allocation Inference

I'm reading the Wikipedia page about how Latent Dirichlet Allocation assigns a topic distribution to a document after the model's been learnt (see this link). I'm very confused by this part of it: Let $n_{j,r}^i$ be the number of word tokens in…

inference topic-models latent-dirichlet-alloc

asked Feb 27 '14 at 22:54

Andrew

203
2
8

4

votes

0 answers

LDA implementation in pymc3

I am implementing LDA with pymc3, using the pymc code from the post Latent Dirichlet Allocation in PyMC as a reference. I am trying to adapt it to pymc3 but am having problems defining w: import numpy as np import pymc3 as pm, theano, theano.tensor as t K…

python pymc latent-dirichlet-alloc

asked Jul 16 '18 at 19:09

Anil Gaddam

41
3

4

votes

1 answer

LDA vs. labeled LDA

I have gone through the techniques and understood the basic ideas. But I want to know which one is usually expected to work better, LDA or Labeled LDA? What are the features of the dataset that help decide between the two?

machine-learning data-mining natural-language topic-models latent-dirichlet-alloc

asked Jul 25 '12 at 21:37

Rohit Jain

143
4

4

votes

1 answer

Inferring the number of topics for gensim's LDA - perplexity, CM, AIC, and BIC

I am confused as to how to interpret the LDA's perplexity fluctuations with different numbers of topics, in the endeavour of determining the best number of topics. Additionally, I would like to know how to implement AIC/BIC with gensim LDA models. I…

aic topic-models latent-dirichlet-alloc latent-class perplexity

asked Jan 12 '18 at 15:01

Jabro

361
2
12

Questions tagged [latent-dirichlet-alloc]