Published on Tue Oct 29 2019

Big Bidirectional Insertion Representations for Documents

Lala Li, William Chan

Abstract

The Insertion Transformer is well suited for long form text generation due to its parallel generation capabilities, requiring O(log₂ n) generation steps to generate n tokens. However, modeling long sequences is difficult, as there is more ambiguity captured in the attention mechanism. This work proposes the Big Bidirectional Insertion Representations for Documents (Big BIRD), an insertion-based model for document-level translation tasks. We scale up the insertion-based models to long form documents. Our key contribution is introducing sentence alignment via sentence-positional embeddings between the source and target document. We show an improvement of +4.3 BLEU on the WMT'19 English→German document-level translation task compared with the Insertion Transformer baseline.
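As a rough sketch of how sentence-positional embeddings could be combined with the usual token and token-position embeddings to align source and target sentences (the class name, dimensions, and summed-lookup parameterization below are illustrative assumptions, not the paper's released code):

```python
import torch
import torch.nn as nn

class InsertionEmbedding(nn.Module):
    """Token + token-position + sentence-position embeddings.

    Hypothetical sketch: the abstract only states that sentence-positional
    embeddings provide sentence alignment between source and target; the exact
    parameterization here (learned lookup tables, summed together) is assumed.
    """

    def __init__(self, vocab_size, d_model, max_tokens=4096, max_sentences=128):
        super().__init__()
        self.token_emb = nn.Embedding(vocab_size, d_model)
        self.token_pos_emb = nn.Embedding(max_tokens, d_model)
        self.sent_pos_emb = nn.Embedding(max_sentences, d_model)

    def forward(self, token_ids, sentence_ids):
        # token_ids, sentence_ids: (batch, seq_len). sentence_ids holds the
        # index of the sentence each token belongs to; using the same indices
        # on the source and target side gives aligned sentences a shared signal.
        positions = torch.arange(token_ids.size(1), device=token_ids.device)
        return (self.token_emb(token_ids)
                + self.token_pos_emb(positions)[None, :, :]
                + self.sent_pos_emb(sentence_ids))
```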

Fri Apr 10 2020
NLP
Longformer: The Long-Document Transformer
Transformer-based models are unable to process long sequences due to their self-attention operation, which scales quadratically with the sequence length. We introduce the Longformer with an attention mechanism that scales linearly with sequence length, making it easy to process documents of thousands of tokens.
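To illustrate the local-attention idea behind linear scaling, here is a minimal sketch of a sliding-window attention mask; note the real Longformer also adds task-specific global attention and uses a custom kernel rather than a dense mask:

```python
import torch

def sliding_window_mask(seq_len, window):
    """Banded attention mask: each token may attend only to neighbors within
    `window` positions. Illustrative only; a dense boolean mask like this is
    itself O(n^2), whereas Longformer computes the windowed attention directly."""
    idx = torch.arange(seq_len)
    return (idx[None, :] - idx[:, None]).abs() <= window  # (seq_len, seq_len) bool

mask = sliding_window_mask(seq_len=8, window=2)
print(mask.int())
```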
Thu Aug 19 2021
NLP
Sentence-T5: Scalable Sentence Encoders from Pre-trained Text-to-Text Models
Sentence embeddings are broadly useful for language processing tasks. While T5 achieves impressive performance on language tasks cast as sequence-to-sequence mapping problems, it is unclear how to produce sentence embeddings from encoder-decoder models. We investigate three methods for extracting T5 sentence embeddings.
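One common way to turn encoder-decoder token representations into a single sentence vector is mean-pooling the encoder outputs; the sketch below shows that strategy only, and does not claim to reproduce the specific three methods studied in Sentence-T5:

```python
import torch

def mean_pool(encoder_outputs, attention_mask):
    """Mean-pool encoder token representations into one sentence embedding.

    encoder_outputs: (batch, seq_len, d_model); attention_mask: (batch, seq_len).
    Padding positions (mask = 0) are excluded from the average.
    """
    mask = attention_mask.unsqueeze(-1).float()
    summed = (encoder_outputs * mask).sum(dim=1)
    counts = mask.sum(dim=1).clamp(min=1.0)
    return summed / counts
```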
Sat Oct 24 2020
NLP
ReadOnce Transformers: Reusable Representations of Text for Transformers
ReadOnce Transformers is an approach to convert a transformer-based model into one that can build an information-capturing, task-independent, and compressed representation of text. The resulting representation is reusable across different examples and tasks, thereby requiring a document to only be read once.
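A minimal sketch of the "read once, reuse everywhere" pattern, assuming a generic `encode_fn` that maps raw text to a small set of vectors; the API names here are placeholders for illustration, not the paper's code:

```python
class DocumentCache:
    """Compute a compressed document representation once, then reuse it.

    `encode_fn` stands in for a ReadOnce-style encoder producing a compact
    matrix of vectors; downstream tasks consume the cached vectors instead of
    re-reading the raw document for every example.
    """

    def __init__(self, encode_fn):
        self.encode_fn = encode_fn
        self._store = {}

    def get(self, doc_id, text):
        if doc_id not in self._store:
            self._store[doc_id] = self.encode_fn(text)  # e.g. (num_vectors, d_model)
        return self._store[doc_id]
```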
Fri Feb 12 2021
NLP
On Efficient Training, Controllability and Compositional Generalization of Insertion-based Language Generators
InsNet is an insertion-based sequence model that can be trained as efficiently as traditional transformer decoders while maintaining the same performance. We evaluate InsNet on story generation and CleVR-CoGENT captioning, showing the advantages of InsNet.
Thu Jun 18 2020
Machine Learning
SEAL: Segment-wise Extractive-Abstractive Long-form Text Summarization
The SEAL model achieves state-of-the-art results on existing long-form summarization tasks. It outperforms strong baseline models on a new dataset, Search2Wiki, with much longer input text.
Fri May 01 2020
Artificial Intelligence
POINTER: Constrained Progressive Text Generation via Insertion-based Generative Pre-training
Large-scale pre-trained language models, such as BERT and GPT-2, have achieved excellent performance in language representation learning and free-form text generation. However, these models cannot be directly employed to generate text under specified lexical constraints. To address this challenge, we propose POINTER, an insertion-based generative pre-training approach for progressive text generation under lexical constraints.
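To make the constrained, progressive-insertion idea concrete, here is a toy sketch that grows a sequence outward from the constraint keywords; `propose_fn` is a hypothetical placeholder for a learned model and this is not POINTER's actual interface or training procedure:

```python
def progressive_insertion(constraints, propose_fn, max_rounds=5):
    """Illustrative sketch of constraint-anchored progressive insertion.

    Start from the constraint keywords and repeatedly insert tokens into the
    gaps between existing tokens until no further insertions are proposed.
    propose_fn(left, right) returns a list of tokens to insert after `left`
    and before `right` (which is None at the end of the sequence), or None.
    """
    tokens = list(constraints)
    for _ in range(max_rounds):
        new_tokens, inserted = [], False
        for i, tok in enumerate(tokens):
            new_tokens.append(tok)
            nxt = tokens[i + 1] if i + 1 < len(tokens) else None
            proposal = propose_fn(tok, nxt)
            if proposal:
                new_tokens.extend(proposal)
                inserted = True
        tokens = new_tokens
        if not inserted:
            break
    return tokens
```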