Published on Mon Apr 27 2020

SCDE: Sentence Cloze Dataset with High Quality Distractors From Examinations

Xiang Kong, Varun Gangal, Eduard Hovy

SCDE is a human-created sentence cloze dataset collected from public school English examinations. The task requires a model to fill up multiple blanks in a passage from a shared candidate set. Experiments show that there is a significant performance gap between advanced models (72%) and humans (87%).

Abstract

We introduce SCDE, a dataset to evaluate the performance of computational models through sentence prediction. SCDE is a human-created sentence cloze dataset, collected from public school English examinations. Our task requires a model to fill up multiple blanks in a passage from a shared candidate set with distractors designed by English teachers. Experimental results demonstrate that this task requires the use of non-local, discourse-level context beyond the immediate sentence neighborhood. The blanks require joint solving and significantly impair each other's context. Furthermore, through ablations, we show that the distractors are of high quality and make the task more challenging. Our experiments show that there is a significant performance gap between advanced models (72%) and humans (87%), encouraging future models to bridge this gap.
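To make the task format concrete, one SCDE-style instance can be thought of as a passage with multiple blanks plus a shared candidate list that contains distractors. The sketch below is illustrative only: the field names, example sentences, and scoring helper are hypothetical, not taken from the dataset.

```python
# Hypothetical representation of one SCDE-style instance: a passage with
# multiple blanks and a shared candidate set that includes distractors.
instance = {
    "passage": [
        "Reading widely builds vocabulary.",
        "__BLANK_0__",
        "Over time, these habits compound.",
        "__BLANK_1__",
    ],
    "candidates": [              # shared across all blanks; extras are distractors
        "Regular practice matters just as much.",
        "Teachers often write near-miss distractors.",
        "Reviewing notes daily reinforces memory.",
    ],
    "answers": [0, 2],           # gold candidate index for each blank
}

def blank_accuracy(predicted, gold):
    """Fraction of blanks filled with the correct candidate."""
    assert len(predicted) == len(gold)
    correct = sum(p == g for p, g in zip(predicted, gold))
    return correct / len(gold)

# A model predicting [0, 1] fills the first blank correctly, the second wrong.
print(blank_accuracy([0, 1], instance["answers"]))  # 0.5
```

Because all blanks draw from one shared candidate set, filling one blank wrongly can consume the candidate another blank needs, which is why the abstract stresses joint solving.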

Tue Nov 29 2016
Artificial Intelligence
NewsQA: A Machine Comprehension Dataset
Crowdworkers supply questions and answers based on a set of over 10,000 news articles from CNN. The performance gap between humans and machines (0.198 in F1) demonstrates that significant progress can be made on NewsQA through future research.
Wed May 20 2020
NLP
Pretraining with Contrastive Sentence Objectives Improves Discourse Performance of Language Models
ConPONO is an inter-sentence objective for pretraining language models that models discourse. Given an anchor sentence, our model is trained to predict the text k sentences away using a sampled-softmax objective. Our model is the same size as BERT-Base, but
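The objective described above, predicting the text k sentences away from an anchor under a sampled softmax, can be sketched as a contrastive loss over sentence vectors. This is a simplified stand-in under stated assumptions (dot-product scoring, randomly sampled negatives), not ConPONO's actual implementation.

```python
import math
import random

def sampled_softmax_loss(anchor, target, negatives):
    """Contrastive sampled-softmax loss (simplified stand-in for the
    ConPONO-style objective): the anchor should score its true neighbor
    (the sentence k away) above sampled negative sentences."""
    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))
    # Candidate list: true target first, then sampled negatives.
    logits = [dot(anchor, c) for c in [target] + negatives]
    # Numerically stable log-sum-exp for the softmax normalizer.
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[0]   # -log softmax probability of the true target

random.seed(0)
d = 8
anchor = [random.gauss(0, 1) for _ in range(d)]
target = list(anchor)                         # easy positive for the demo
negatives = [[random.gauss(0, 1) for _ in range(d)] for _ in range(4)]
print(sampled_softmax_loss(anchor, target, negatives) >= 0.0)  # True
```

Minimizing this loss pushes the anchor representation toward the sentence k positions away and apart from the negatives, which is what gives the pretrained encoder its discourse sensitivity.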
Tue Feb 11 2020
Artificial Intelligence
ReClor: A Reading Comprehension Dataset Requiring Logical Reasoning
New Reading Comprehension dataset requiring logical reasoning (ReClor) extracted from standardized graduate admission examinations. Empirical results show that state-of-the-art models have an outstanding ability to capture biases contained in the dataset with high accuracy.
Tue Apr 07 2020
NLP
A Sentence Cloze Dataset for Chinese Machine Reading Comprehension
Mon Jun 20 2016
Artificial Intelligence
The LAMBADA dataset: Word prediction requiring a broad discourse context
LAMBADA is a dataset to evaluate the capabilities of computational models for text understanding by means of a word prediction task. We show that none of the state-of-the-art language models reaches accuracy above 1% on this benchmark.
Sat Nov 09 2019
Neural Networks
Improving Machine Reading Comprehension via Adversarial Training
Adversarial training (AT) as a regularization method has proved its effectiveness in various tasks, such as image classification and text classification. In this paper, we aim to apply AT on machine reading comprehension (MRC) and study its effects from multiple perspectives.