Highest Voted 'latent-semantic-indexing' Questions - Statistical Analysis Stack Exchange

9

votes

1 answer

Understanding Singular Value Decomposition in the context of LSI

My question is about Singular Value Decomposition (SVD) in general, and Latent Semantic Indexing (LSI) in particular. Say I have a matrix $A_{\text{word} \times \text{document}}$ that contains the frequencies of 5 words across 7 documents. A = matrix(data=c(2,0,8,6,0,3,1, …

r svd natural-language latent-semantic-indexing

asked Jul 16 '14 at 12:31

Zhubarb

7,753
2
28
44

5

votes

2 answers

Latent Dirichlet Allocation vs. pLSA

In the original LDA paper it is stated that: "The parameters for a k-topic pLSI model are k multinomial distributions of size V and M mixtures over the k hidden topics. This gives kV + kM parameters and therefore linear growth in M. The linear…

overfitting latent-variable dirichlet-distribution latent-semantic-analysis latent-semantic-indexing

asked Jun 07 '15 at 11:34

Shayan

231
3
8

4

votes

1 answer

Latent Semantic Indexing and Data Centering

In PCA it's common to center the data, i.e. preprocess the data matrix such that the columns have zero mean. PCA can be done via SVD, but in this case the data matrix also has to be mean-centered. If we don't center it, the found principal…

pca svd non-negative-matrix-factorization latent-semantic-analysis latent-semantic-indexing

asked May 18 '15 at 18:03

Alexey Grigorev

8,147
3
26
39
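The centering question above notes that PCA can be done via SVD only if the data matrix is mean-centered first. The following is a minimal NumPy sketch of that point, not taken from the question itself: the function name `pca_via_svd` and the toy data are hypothetical, chosen so that the cloud of points lies far from the origin.

```python
import numpy as np

def pca_via_svd(X, center=True):
    """Return (axes, scaled squared singular values) for a samples x features matrix X."""
    X = np.asarray(X, dtype=float)
    if center:
        X = X - X.mean(axis=0)  # subtract each column's mean
    # Thin SVD: X = U @ diag(s) @ Vt; the rows of Vt are the right singular vectors.
    _, s, Vt = np.linalg.svd(X, full_matrices=False)
    # These are variances along the axes only when X has been centered.
    scaled = s**2 / (X.shape[0] - 1)
    return Vt, scaled

# Hypothetical toy data: points spread along (1, 1), shifted far from the origin.
rng = np.random.default_rng(0)
t = rng.normal(size=200)
X = np.column_stack([t, t]) + 0.05 * rng.normal(size=(200, 2)) + np.array([10.0, -10.0])

centered_axes, centered_var = pca_via_svd(X, center=True)
raw_axes, _ = pca_via_svd(X, center=False)

print("first axis, centered:  ", centered_axes[0])
print("first axis, uncentered:", raw_axes[0])
```

With centering, the first axis follows the direction of greatest spread and the scaled squared singular values match the eigenvalues of the sample covariance matrix. Without centering, the first axis in this toy example mostly points from the origin toward the mean of the data.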

3

votes

0 answers

Difference between Latent and Explicit Semantic Analysis

I'm trying to analyse the paper "Computing Semantic Relatedness using Wikipedia-based Explicit Semantic Analysis". One component of the system described therein that I'm currently grappling with is the difference between Latent and Explicit…

machine-learning natural-language latent-semantic-indexing

asked May 14 '15 at 09:24

smatthewenglish

141
5

3

votes

1 answer

How are the clustering algorithms using the concept of Latent Semantic Analysis?

I have come across Latent Semantic Analysis, but I could not understand it. Can Latent Semantic Analysis be used by humans to cluster data sets? For convenience, let us consider the data sets to be two-dimensional. Can humans…

latent-semantic-analysis latent-semantic-indexing

asked Mar 18 '15 at 13:41

Ramseyl

51
1
5

2

votes

0 answers

What are some of the advantages and disadvantages of Explicit Semantic Analysis (ESA)?

I am writing a report on semantic analysis and I have come across a celebrated paper, Computing Semantic Relatedness using Wikipedia-based Explicit Semantic Analysis, by Evgeniy Gabrilovich and Shaul Markovitch. I have been looking at this paper and some…

latent-semantic-indexing

asked Sep 23 '16 at 15:58

silent_dev

557
1
6
16

2

votes

1 answer

How to Use LSA to Create Topics?

Just want to know the general process of creating document topics via LSA. For creating document clusters, I know first I should get SVD dimensions and then use k-means clustering on these SVD dimensions to create document clusters. For creating…

natural-language topic-models latent-semantic-indexing

asked Jun 25 '16 at 18:56

kokoma

99
4
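The document-clustering route described in the LSA question above (take SVD dimensions, then run k-means on them) could be sketched roughly as follows. This is an illustration under stated assumptions, not the asker's code: the function names, the tiny term-document matrix, and the deterministic k-means initialisation from the first points are all hypothetical simplifications.

```python
import numpy as np

def lsa_document_vectors(term_doc, k):
    """Project documents into a k-dimensional LSA space via a truncated SVD."""
    A = np.asarray(term_doc, dtype=float)        # rows = terms, columns = documents
    _, s, Vt = np.linalg.svd(A, full_matrices=False)
    return Vt[:k].T * s[:k]                      # one row per document

def kmeans(points, n_clusters, n_iter=50):
    """Plain Lloyd's algorithm; centers start at the first n_clusters points (assumption)."""
    centers = points[:n_clusters].copy()
    for _ in range(n_iter):
        # Distance from every point to every center, then nearest-center labels.
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute each center; keep the old one if a cluster ends up empty.
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(n_clusters)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

# Hypothetical counts: terms 0-1 occur only in documents 0 and 2,
# terms 2-3 only in documents 1 and 3.
term_doc = np.array([
    [2, 0, 3, 0],
    [1, 0, 2, 0],
    [0, 3, 0, 1],
    [0, 2, 0, 2],
])

doc_vectors = lsa_document_vectors(term_doc, k=2)
labels = kmeans(doc_vectors, n_clusters=2)
print(labels)
```

On this toy matrix, documents 0 and 2 end up in one cluster and documents 1 and 3 in the other. A real corpus would normally call for term weighting and a more careful k-means initialisation.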

2

votes

0 answers

Supervised semantic analysis

Dimensionality reduction and semantic vectorization techniques like LSA, pLSA, LDA and Random Indexing do not take advantage of semantically labeled data the way Explicit Semantic Analysis (ESA) does. I am looking for the state of the art in supervised semantic analysis…

machine-learning latent-semantic-analysis latent-semantic-indexing

asked Apr 02 '16 at 18:40

hernan

61
3

1

vote

0 answers

Semantic Analysis: Set a default value for examples not in scope of the training set

I am working on a semantic analysis problem and wanted to know if anyone has been able to set a default value, say a probability of zero or 0.5, for phrases/words that the machine learning algorithm has never seen. Using scikit-learn's classifiers…

machine-learning latent-semantic-indexing

asked Mar 26 '18 at 23:52

MyopicVisage

133
6

1

vote

0 answers

Latent class in Gaussian mixture model

I would like some advice on the latent class in the mixture model. I wish to code the latent class by hand without relying on an existing R package. This is my code snippet for the finite mixture: no<-nrows(myData.obs) prob1 =…

r normal-distribution latent-semantic-indexing

asked Dec 31 '16 at 10:43

Jas

11
3

1

vote

0 answers

Is Latent Semantic Analysis a clustering algorithm?

The input of LSA is a term-frequency matrix for a set of documents. What is the output? If I want to group a bunch of news articles into different clusters, can I use LSA? If not, what are the major uses of LSA? Is it similar to K-means?

k-means latent-semantic-indexing

asked Nov 26 '16 at 06:13

user697911

121
4

1

vote

0 answers

Calculating perplexity for LSA

I am new to topic modelling, so kindly bear with me if my question is silly. I am trying to calculate perplexity after applying LSA. I am aware that LSA returns negative values, so I followed the steps stated in Coccaro to find the probability of each…

machine-learning topic-models latent-semantic-indexing

asked Oct 24 '16 at 10:30

Hemaa mathavan

121
6

1

vote

0 answers

LDA or pLSA for short documents?

I'd like to classify short documents, from a predefined set of words. What algorithm would you suggest, LDA or pLSA? My use case: I have a list of users, and for each user a list of the pages she likes. My goal is to classify users (documents) into…

topic-models dirichlet-process latent-semantic-analysis latent-semantic-indexing

asked Sep 07 '15 at 07:28

Uri Goren

1,701
1
10
24

1

vote

0 answers

Latent semantic analysis and keyword extraction

Well, I've started with a collection of documents. The aim is to extract keywords for each document. I've made a document-term matrix, to which I applied a singular value decomposition. I've made a new matrix (an approximation of the original…

latent-semantic-indexing

asked May 01 '15 at 10:04

Silke

285
2
13

0

votes

0 answers

How to interpret negative coefficients in LSI model?

I am using Gensim 4 to train an LSI model. I tried to print some topics, and this is one of the results: -0.246*"cancer" + -0.218*"patient" + -0.200*"risk" + -0.131*"ci" + -0.122*"associ" + -0.120*"breast" + -0.118*"women" + -0.114*"atlas" +…

interpretation latent-semantic-analysis latent-semantic-indexing

asked Jul 05 '21 at 17:13

robertspierre

1,358
6
21

Questions tagged [latent-semantic-indexing]