Questions tagged [text-summarization]

Text summarization is the process of reducing a text document in order to create a summary that retains the most important points of the original document.

23 questions
17
votes
2 answers

Interpreting ROUGE scores

I recently read the paper on Salesforce's advances in abstractive text summarisation. It states that the achieved ROUGE-1 score of 41.16 is significantly better than the previous state of the art. I also read this paper on (mainly extractive) text…
Alan Buxton
10
votes
2 answers

Log-likelihood ratio in document summarization

I initially asked this on Stack Overflow and was referred to this site, so here goes: I am implementing some unsupervised methods of content-selection/extraction-based document summarization and I'm confused about what my textbook calls the…
Richard
5
votes
1 answer

Summarization of text documents (legal domain) using deep learning techniques

I am referring to the site deeplearning.net on how to implement deep learning architectures. I have read quite a few research papers on document summarization (both single-document and multi-document) but I am unable to figure out how exactly the…
4
votes
1 answer

Gaming the ROUGE metric for text summarization

ROUGE seems to be the standard way of evaluating the quality of machine generated summaries of text documents by comparing them with reference summaries (human generated). $$ROUGE_{n}= \frac {\sum_{s\in \textrm{Ref Summaries} } \sum_{gram_{n}\in…
wabbit
3
votes
0 answers

Do supervised methods outperform unsupervised methods for generic multi-document summarization of news?

{1} says: For generic multi-document summarization of news, supervised methods have not been shown to outperform competitive unsupervised methods based on a single feature such as the presence of topic words and graph methods. The paper was…
Franck Dernoncourt
2
votes
1 answer

ROUGE scores for extractive vs abstractive text summarization

The ROUGE score (scores) allows us to measure (although not in a perfect way) the quality of our text summarization by computing the frequency of overlapping n-grams between our produced summary and the reference one (or ones - usually created by…
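
The n-gram overlap described in this excerpt can be sketched as follows. This is a minimal, recall-oriented ROUGE-N illustration and not the official ROUGE toolkit: the function names `ngrams` and `rouge_n` are hypothetical, and lowercasing plus whitespace tokenization are simplifying assumptions (real implementations typically add stemming and other preprocessing options).

```python
from collections import Counter


def ngrams(tokens, n):
    """Count every contiguous n-gram in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, references, n=1):
    """Recall-oriented ROUGE-N: clipped n-gram matches over all
    reference n-grams, summed across the reference summaries.

    Assumption: text is lowercased and split on whitespace.
    """
    cand_counts = ngrams(candidate.lower().split(), n)
    matched = 0
    total = 0
    for ref in references:
        ref_counts = ngrams(ref.lower().split(), n)
        # Clip each match so a reference n-gram is never credited more
        # often than it appears in the candidate summary.
        matched += sum(min(count, cand_counts[gram])
                       for gram, count in ref_counts.items())
        total += sum(ref_counts.values())
    return matched / total if total else 0.0


candidate = "the cat sat on the mat"
references = ["the cat is on the mat"]
rouge_1 = rouge_n(candidate, references, n=1)
rouge_2 = rouge_n(candidate, references, n=2)
```

Because the denominator counts reference n-grams only, this variant rewards coverage of the reference and does not penalize a long candidate, which is one reason precision and F-measure variants are often reported alongside it.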
2
votes
1 answer

How to test generated text

I am creating a text generation algorithm for my master's research. I have a dialogue between two people and I would like to simulate one part of the conversation with naturally generated text (not templated text). For the training, I'm simply using…
2
votes
1 answer

Machine learning for product names

I have a machine learning challenge I may be overthinking. I have a set of 3.5 million products (not unique, there are multiple instances of each product). Each product has a "description" from its manufacturer which is generally a long string of…
2
votes
1 answer

What other approaches are there for abstractive summarization, other than seq2seq?

I'm researching abstractive text summarization and have come across many recent papers. They all seem to be focusing on Sequence to Sequence models based on RNNs. Apart from RNNs, what other approaches are there when it comes to abstractive text…
2
votes
0 answers

How to determine summary-like tables on any informative web (HTML) page

I am struggling with determining the best way to guess which table (if any) on a given web page is the summary table. Examples would be the first, right-side tables on these pages. http://wikitravel.org/en/China#quickbar…
2
votes
1 answer

Document summarization with Log-likelihood ratio

I am trying to implement text summarization using the log-likelihood ratio, as explained in https://www.cs.bgu.ac.il/~elhadad/nlp16/nenkova-mckeown.pdf under section 2.1. I do not understand what they really mean by background and foreground corpus. In…
2
votes
2 answers

Recall and precision in text summarization

As you know, extractive text summarization is a binary classification problem (a sentence should be included in the summary or not). We have developed our text summarization system with three different algorithms and evaluated them with ROUGE. Here is…
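
Treating extraction as per-sentence binary classification, precision and recall can be computed over sentence indices. The sketch below is only an illustration under stated assumptions: `sentence_prf` is a hypothetical helper, and it assumes gold-standard sentence labels are available, which the excerpt does not confirm.

```python
def sentence_prf(selected, gold):
    """Precision, recall and F1 over sentence indices.

    selected: indices of sentences the system put in the summary.
    gold: indices of sentences marked as summary-worthy (assumed given).
    """
    selected, gold = set(selected), set(gold)
    true_positives = len(selected & gold)
    precision = true_positives / len(selected) if selected else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# The system picked sentences 0, 2 and 5; annotators picked 0, 1, 2 and 3.
precision, recall, f1 = sentence_prf([0, 2, 5], [0, 1, 2, 3])
```

Unlike ROUGE, this measure gives no credit for selecting a sentence that paraphrases a gold sentence, so the two evaluations can disagree on the same system output.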
1
vote
0 answers

Statistical Analysis on Comments and Feedback

I have access to 10,000 comments for a mobile app, and I want to run some interesting statistical analysis on them. What I have done so far: Look at the frequency of each word in all the comments. Then look at a subset of these words that are…
1
vote
0 answers

What is the effect of changing the weight decay and warm-up steps in fine-tuning PEGASUS?

I am fine-tuning the PEGASUS model using this script. I am currently using the SAMSum dataset and I have reached a point at which the output doesn't get better. Examples: The Actual Summary Alexis and Carter met tonight. Carter would like to meet…
1
vote
0 answers

Considerations in applying machine learning / AI to "coding" of text responses

A client is looking at applying AI/ML to a dataset of textual responses for the purpose of: a) extracting one or more concepts or meanings from each response, and b) cross-referencing with other responses containing similar concepts or…