Published on Tue Apr 18 2017

Extractive Summarization: Limits, Compression, Generalized Model and Heuristics

Rakesh Verma, Daniel Lee


Abstract

Because it promises to alleviate information overload, text summarization has attracted the attention of many researchers, yet it remains a serious challenge. Here, we first prove empirical limits on the recall (and F1-scores) of extractive summarizers on the DUC datasets under ROUGE evaluation, for both the single-document and multi-document summarization tasks. Next, we define the concept of compressibility of a document and present a new model of summarization, which generalizes existing models in the literature and integrates several dimensions of summarization, viz., abstractive versus extractive, single versus multi-document, and syntactic versus semantic. Finally, we examine some new and existing single-document summarization algorithms in a single framework and compare them with state-of-the-art summarizers on DUC data.
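To make the idea of a recall limit concrete, here is a minimal sketch, not the authors' procedure. It assumes ROUGE-1 recall with clipped unigram counts and plain lowercase whitespace tokenization (no stemming or stopword removal), and it ignores summary length limits. Under those assumptions, an extract built from document sentences cannot recall more reference unigrams than the whole document does, so scoring the full document gives a ceiling. The function names and the toy texts are hypothetical.

```python
from collections import Counter


def unigram_counts(text):
    """Count lowercase whitespace-separated tokens (a simplifying assumption)."""
    return Counter(text.lower().split())


def rouge1_recall(candidate, reference):
    """Clipped unigram overlap divided by the number of reference unigrams."""
    cand = unigram_counts(candidate)
    ref = unigram_counts(reference)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    overlap = sum(min(count, cand[token]) for token, count in ref.items())
    return overlap / total


def extractive_recall_ceiling(document, reference):
    """Recall of the whole document: no extract of its sentences can exceed it."""
    return rouge1_recall(document, reference)


# Toy data, invented for illustration only.
document = "the cat sat on the mat . the dog barked at the cat ."
reference = "a cat sat while a dog barked loudly"

ceiling = extractive_recall_ceiling(document, reference)
one_sentence = rouge1_recall("the cat sat on the mat .", reference)
print(f"ceiling={ceiling:.2f}, one-sentence extract={one_sentence:.2f}")
```

In this toy case the reference words "a", "while", and "loudly" never occur in the document, so every extractive summary loses them regardless of which sentences it selects.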

Thu Dec 09 2010
Artificial Intelligence
MUDOS-NG: Multi-document Summaries Using N-gram Graphs (Tech Report)
This report describes the MUDOS-NG summarization system, which applies a set of language-independent and generic methods. The proposed methods are mostly combinations of simple operators on a generic character n-gram graph representation of texts.
Fri Jul 10 2015
NLP
Extending a Single-Document Summarizer to Multi-Document: a Hierarchical Approach
The increasing amount of online content motivated the development of multi-document summarization methods. The proposed methods are based on the hierarchical combination of single-document summaries.
Sat Aug 01 2020
NLP
Experiments in Extractive Summarization: Integer Linear Programming, Term/Sentence Scoring, and Title-driven Models
NewsSumm is a framework that includes many existing and new approaches for summarization. NewsSumm's flexibility allows different algorithms and sentence scoring schemes to be combined seamlessly.
Thu Feb 11 2021
Machine Learning
Unsupervised Extractive Summarization using Pointwise Mutual Information
Unsupervised approaches to extractive summarization usually rely on a notion of sentence importance. We propose new metrics of relevance and redundancy using pointwise mutual information. We then develop a greedy sentence selection algorithm to maximize relevance and minimize redundancy of extracted sentences.
Tue Mar 29 2016
NLP
Learning-Based Single-Document Summarization with Compression and Anaphoricity Constraints
We present a discriminative model for single-document summarization that combines compression and anaphoricity constraints. Our model selects textual units to include in the summary based on a rich set of sparse features whose weights are learned on a large corpus.
Tue May 05 2020
NLP
Artemis: A Novel Annotation Methodology for Indicative Single Document Summarization
Artemis is a novel hierarchical annotation process that produces indicative summaries for documents from multiple domains. Compared to other annotation methods such as Relative Utility and Pyramid, Artemis is more tractable because judges don't need to look at all the sentences in a document.