Questions tagged [perplexity]

Perplexity is a measure used to evaluate how well a probability model predicts a given test set. It is closely related to cross-entropy and is commonly used to evaluate language models.

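As a quick illustration of the relationship, here is a minimal sketch (plain Python, with made-up per-token probabilities) showing that perplexity is just the exponential of the cross-entropy on the test data:

```python
import math

# Hypothetical per-token probabilities a model assigns to a 4-token test sequence.
token_probs = [0.2, 0.1, 0.05, 0.3]

# Cross-entropy (nats): average negative log-probability per token.
cross_entropy = -sum(math.log(p) for p in token_probs) / len(token_probs)

# Perplexity is the exponential of the cross-entropy.
perplexity = math.exp(cross_entropy)
print(cross_entropy, perplexity)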

21 questions
53 votes · 4 answers

What is perplexity?

I came across the term perplexity, which refers to the log-averaged inverse probability on unseen data. The Wikipedia article on perplexity does not give an intuitive meaning for it. This perplexity measure was used in the pLSA paper. Can anyone explain…
Learner
  • 4,007
  • 11
  • 37
  • 39
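In short, the "log-averaged inverse probability" is equivalent to the inverse geometric mean of the probabilities the model assigns to the held-out tokens. A minimal sketch with made-up probabilities (plain Python):

```python
import math

# Hypothetical model probabilities for N held-out tokens.
probs = [0.25, 0.1, 0.5, 0.05]
N = len(probs)

# "Log-averaged inverse probability": exponentiate the mean log of 1/p.
ppl_log_avg = math.exp(sum(math.log(1.0 / p) for p in probs) / N)

# Equivalent form: the inverse of the geometric mean of the probabilities.
ppl_geo = 1.0 / (math.prod(probs) ** (1.0 / N))

print(ppl_log_avg, ppl_geo)  # the two numbers agree
```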
11 votes · 2 answers

Perplexity and cross-entropy for n-gram models

I am trying to understand the relationship between cross-entropy and perplexity. In general, for a model $M$, $\text{Perplexity}(M) = 2^{\text{entropy}(M)}$. Does this relationship hold for all n-gram orders, i.e. unigram, bigram, etc.?
Margalit
  • 111
  • 1
  • 4
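A brief sketch of the identity under the usual convention (cross-entropy measured in bits per token; the bigram probabilities below are made up):

```python
import math

# Hypothetical bigram probabilities assigned by a model M to a test sentence.
bigram_probs = [0.2, 0.4, 0.1, 0.25]

# Cross-entropy of M in bits per token (log base 2).
H = -sum(math.log2(p) for p in bigram_probs) / len(bigram_probs)

# Perplexity(M) = 2 ** H(M); the same identity applies to unigram, trigram,
# etc. models, as long as H is the per-token cross-entropy of that model.
perplexity = 2 ** H
print(H, perplexity)
```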
7 votes · 2 answers

Why does larger perplexity tend to produce clearer clusters in t-SNE?

Why does larger perplexity tend to produce clearer clusters in t-SNE? By reading the original paper, I learned that the perplexity in t-SNE is $2$ raised to the power of the Shannon entropy of the conditional distribution induced by a data point. And it is…
meTchaikovsky
  • 1,414
  • 1
  • 9
  • 23
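A minimal sketch of that definition, assuming NumPy and a made-up conditional distribution for one data point:

```python
import numpy as np

# Hypothetical conditional distribution p_{j|i} induced by one data point
# (Gaussian neighbour probabilities, summing to 1).
p = np.array([0.5, 0.3, 0.15, 0.05])

# Shannon entropy (bits) of the conditional distribution.
H = -np.sum(p * np.log2(p))

# The perplexity of point i is 2 ** H: roughly the effective number of
# neighbours over which the distribution spreads its mass.
print(2 ** H)
```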
7 votes · 2 answers

Intuition behind perplexity parameter in t-SNE

While reading Laurens van der Maaten's paper about t-SNE, we encounter the following statement about perplexity: The perplexity can be interpreted as a smooth measure of the effective number of neighbors. The performance of SNE is fairly robust…
Kuba_
  • 171
  • 4
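For context, a small usage sketch (assuming scikit-learn and random data) of how the perplexity parameter is passed to t-SNE; larger values make each point consider more neighbours:

```python
import numpy as np
from sklearn.manifold import TSNE

# Hypothetical data: 200 points in 10 dimensions.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))

# Perplexity sets the effective number of neighbours each point considers;
# it must be smaller than the number of samples.
for perplexity in (5, 30, 50):
    emb = TSNE(n_components=2, perplexity=perplexity, random_state=0).fit_transform(X)
    print(perplexity, emb.shape)
```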
4 votes · 1 answer

Inferring the number of topics for gensim's LDA - perplexity, CM, AIC, and BIC

I am confused about how to interpret LDA's perplexity fluctuations with different numbers of topics when trying to determine the best number of topics. Additionally, I would like to know how to implement AIC/BIC with gensim LDA models. I…
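A rough sketch of how perplexity over a range of topic counts might be computed with gensim (toy corpus made up here; the conversion `exp2(-bound)` mirrors the perplexity estimate gensim itself logs, and AIC/BIC are not covered):

```python
import numpy as np
from gensim.corpora import Dictionary
from gensim.models import LdaModel

# Toy corpus of tokenised documents (made up for illustration).
texts = [["topic", "model", "lda"], ["perplexity", "held", "out"],
         ["topic", "perplexity", "model"], ["lda", "held", "out", "model"]]
dictionary = Dictionary(texts)
corpus = [dictionary.doc2bow(t) for t in texts]

# Compare held-out perplexity (here, reusing the toy corpus) across topic counts.
for k in (2, 3, 4):
    lda = LdaModel(corpus=corpus, num_topics=k, id2word=dictionary, random_state=0)
    bound = lda.log_perplexity(corpus)   # per-word log-likelihood bound
    print(k, np.exp2(-bound))            # gensim reports perplexity as 2**(-bound)
```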
4 votes · 1 answer

Comparing language model probability scores between sentences of varying length

My question is: how can I compare language model (LM) scores for two sentences of different lengths? Probabilities are < 1, and since the LM score for a sentence is a product of bigram or trigram probabilities, depending on whether it's a bigram or…
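A common remedy is to normalise by sentence length, i.e. compare per-word perplexities rather than raw sentence probabilities. A minimal sketch with made-up per-token log-probabilities:

```python
import math

# Hypothetical per-token log-probabilities (natural log) from a language model
# for two sentences of different lengths.
logprobs_short = [-2.1, -1.3, -3.0]
logprobs_long = [-2.0, -1.1, -2.5, -1.8, -2.2, -1.9]

def per_word_perplexity(logprobs):
    # Normalise by length: exp of the average negative log-probability,
    # which is comparable across sentences of different lengths.
    return math.exp(-sum(logprobs) / len(logprobs))

print(per_word_perplexity(logprobs_short))
print(per_word_perplexity(logprobs_long))
```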
3 votes · 1 answer

Breaking substitution cipher with language model

Frequency analysis is a common tool used to break substitution ciphers, but it often relies on the intuition and guesswork of a human. Since language models can objectively calculate perplexity (how surprising a piece of language seems), they seem like a…
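As a rough illustration of the idea (not a full cipher breaker): a tiny add-one-smoothed character-bigram model, estimated from a made-up reference text, can rank candidate decryptions by perplexity:

```python
import math
from collections import defaultdict

ALPHABET = "abcdefghijklmnopqrstuvwxyz "

def char_bigram_logprobs(reference_text):
    # Add-one smoothed character-bigram model estimated from a reference text.
    counts = defaultdict(lambda: defaultdict(int))
    for a, b in zip(reference_text, reference_text[1:]):
        counts[a][b] += 1
    logp = {}
    for a in ALPHABET:
        total = sum(counts[a].values()) + len(ALPHABET)
        for b in ALPHABET:
            logp[(a, b)] = math.log((counts[a][b] + 1) / total)
    return logp

def perplexity(text, logp):
    # Lower perplexity = the text looks more like the reference language,
    # so the corresponding substitution key is a better candidate.
    lp = [logp[(a, b)] for a, b in zip(text, text[1:])]
    return math.exp(-sum(lp) / len(lp))

logp = char_bigram_logprobs("the quick brown fox jumps over the lazy dog " * 50)
candidates = ["attack at dawn", "zqqzxj zq wzvn"]          # made-up decryptions
print(min(candidates, key=lambda c: perplexity(c, logp)))  # expect: "attack at dawn"
```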
2 votes · 1 answer

Calculating perplexity with smoothing techniques (NLP)

This question is about smoothed n-gram language models. When we use additive smoothing on the training set to determine the conditional probabilities and calculate the perplexity of the training data, where exactly is this useful when it comes to the test…
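A small sketch (toy corpus, hypothetical add-k constant) of why the smoothing matters at test time: unseen test bigrams receive non-zero probability, so the test perplexity stays finite:

```python
import math
from collections import Counter

train = "the cat sat on the mat".split()
test = "the cat sat on the hat".split()
vocab = set(train) | set(test)
V, k = len(vocab), 0.5              # vocabulary size and add-k constant (made up)

unigrams = Counter(train)
bigrams = Counter(zip(train, train[1:]))

def p_add_k(prev, word):
    # Additive smoothing: every bigram, seen or unseen, gets some probability
    # mass, so an unseen test bigram like ("the", "hat") no longer gives
    # zero probability and infinite perplexity.
    return (bigrams[(prev, word)] + k) / (unigrams[prev] + k * V)

logp = [math.log(p_add_k(a, b)) for a, b in zip(test, test[1:])]
print(math.exp(-sum(logp) / len(logp)))   # perplexity of the test sentence
```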
2 votes · 1 answer

Perplexity formula in the t-SNE paper vs. in the implementation

The perplexity formula in the official t-SNE paper is not the same as the one in its implementation. In the implementation (MATLAB):
% Function that computes the Gaussian kernel values given a vector of
% squared Euclidean distances, and the precision of…
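A Python sketch of that routine's logic (not the original code): given squared distances and a precision beta, it returns the entropy and the normalised Gaussian kernel values. Because it works in natural logs, exp(H) plays the role the paper's base-2 formula gives to 2^H, so the two forms agree:

```python
import numpy as np

def hbeta(sq_dists, beta):
    # Gaussian kernel values for a vector of squared Euclidean distances and a
    # precision beta = 1 / (2 * sigma**2); a sketch of the implementation's
    # Hbeta-style routine, not the original code.
    P = np.exp(-sq_dists * beta)
    sumP = np.sum(P)
    # Entropy in nats of the normalised distribution. The paper states the
    # formula in bits (log base 2), so the paper's 2**H equals exp(H) here.
    H = np.log(sumP) + beta * np.sum(sq_dists * P) / sumP
    return H, P / sumP

sq_dists = np.array([0.5, 1.0, 2.0, 4.0])   # made-up squared distances
H, P = hbeta(sq_dists, beta=1.0)
print(np.exp(H), 2 ** (H / np.log(2)))      # the same perplexity either way
```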
2 votes · 1 answer

Why do I get weird results when using high perplexity in t-SNE?

I played around with the t-SNE implementation in scikit-learn and found that increasing the perplexity seemed to always result in a torus/circle. I couldn't find any mention of this in the literature. Check out some examples below, which is just a…
2 votes · 1 answer

How should perplexity of LDA behave as value of the latent variable k increases?

When increasing the value of the latent variable k for LDA (latent Dirichlet allocation), how should perplexity behave: On the training set? On the testing set?
1 vote · 0 answers

Perplexity for short sentences

I have a model that outputs short sentences and want to compare the quality of its outputs for different configurations by computing their perplexities using another model. I tried to use the gpt-2 model from…
dj_rydu
  • 11
  • 3
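One way this is commonly done, sketched here with the Hugging Face transformers library and the public gpt2 checkpoint (note that perplexity estimates on very short sentences are noisy, which is part of what the question is about):

```python
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def sentence_perplexity(text):
    enc = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        # With labels equal to the inputs, the model returns the mean
        # cross-entropy over the predicted tokens; exponentiate it.
        loss = model(**enc, labels=enc["input_ids"]).loss
    return torch.exp(loss).item()

print(sentence_perplexity("The cat sat on the mat."))
print(sentence_perplexity("Mat the on sat cat the."))
```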
1 vote · 0 answers

Why does perplexity change with different ranges of k?

I ran a 5-fold cross-validation in R to calculate LDA perplexity for k = 2:9 using a 10% sample of my data. The output was:

k               2       3      4      5      6      7      8      9
perplexity 156277  139378  71659  68998  67471  32890  32711  31904

I re-ran…
1 vote · 1 answer

Word perplexity on a subword language model

Suppose we have a corpus $X = x_1 \dots x_N$ in which every word can be represented using subwords (from a fixed-size vocabulary of subwords), $x_i = x_{i,0} \dots x_{i,M(x_i)}$, where $M(x_i)$ is the number of subwords into which the word is divided. For a word language…
wswin
  • 21
  • 1
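A sketch of the usual conversion (hypothetical numbers): the total log-probability of the corpus is the same whether it is factored over words or over their subwords, so word-level perplexity just renormalises the same total negative log-likelihood by the number of words:

```python
import math

# Hypothetical output of a subword language model on a small test corpus:
# the negative log-likelihood (nats) of each subword token.
subword_nll = [2.3, 0.9, 1.7, 2.0, 0.6, 1.1, 1.8]   # 7 subwords in total
num_words = 4                                        # the same text has 4 words

# Only the normaliser changes between the two perplexities.
total_nll = sum(subword_nll)
print("subword-level perplexity:", math.exp(total_nll / len(subword_nll)))
print("word-level perplexity:   ", math.exp(total_nll / num_words))
```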
1 vote · 1 answer

LDA and test data perplexity

I've performed Latent Dirichlet Allocation on a training set of documents. At the ideal number of topics I would expect a minimum in the perplexity for the test dataset. However, I find that the perplexity for my test dataset increases with the number of…
BHC
  • 141
  • 1
  • 3