I have a model that outputs short sentences and want to compare the quality of its outputs for different configurations by computing their perplexities using another model.
I tried to use the GPT-2 model from https://github.com/huggingface/pytorch-transformers, but I get perplexities of over 1,000, so I am not sure whether these results are meaningful.
I noticed that when I feed in short sentences from Wikitext-2, the perplexities are also very high. It seems that the language model has a hard time predicting the next word when only a small context is available.
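For context on why short inputs score so badly: perplexity is the exponential of the mean negative log-likelihood per token, so with only a few tokens a single hard-to-predict word (e.g. the first content word, seen with almost no context) dominates the average. A minimal sketch of the computation, with made-up per-token log-probabilities purely for illustration:

```python
import math

def perplexity(token_logprobs):
    """Perplexity = exp of the mean negative log-likelihood per token."""
    nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(nll)

# Hypothetical natural-log probabilities; -9.0 marks one surprising token.
short = [-2.0, -9.0, -2.5]                                  # 3-token sentence
longer = [-2.0, -9.0, -2.5, -1.5, -1.0, -2.0, -1.8, -1.2]   # 8-token sentence

print(perplexity(short))   # the surprising token dominates the short average
print(perplexity(longer))  # the same token is diluted over more tokens
```

Here the short sentence gets a much higher perplexity even though both contain the same surprising token, which matches the pattern you describe on Wikitext-2 sentences.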
My questions are:
Should I rely on the perplexities from GPT-2, or does anyone know of a language model that works better on short sequences?
Does it make sense to take only outputs of a fixed length into account (say, sentences of 20 words) in order to remove the length bias in the comparison?
I would be happy about any suggestions :)