Published on Fri May 31 2019

Improving Open Information Extraction via Iterative Rank-Aware Learning

Zhengbao Jiang, Pengcheng Yin, Graham Neubig


Abstract

Open information extraction (IE) is the task of extracting open-domain assertions from natural language sentences. A key step in open IE is confidence modeling, ranking the extractions based on their estimated quality to adjust precision and recall of extracted assertions. We found that the extraction likelihood, a confidence measure used by current supervised open IE systems, is not well calibrated when comparing the quality of assertions extracted from different sentences. We propose an additional binary classification loss to calibrate the likelihood to make it more globally comparable, and an iterative learning process, where extractions generated by the open IE model are incrementally included as training samples to help the model learn from trial and error. Experiments on OIE2016 demonstrate the effectiveness of our method. Code and data are available at https://github.com/jzbjyb/oie_rank.
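The calibration idea in the abstract can be illustrated with a minimal sketch. It assumes the confidence score is the average per-token log-likelihood of the tag sequence, and uses a sigmoid plus binary cross-entropy as the calibration term that pushes correct extractions' scores up and incorrect ones' down; the function names and the exact loss form are illustrative assumptions, not the paper's implementation.

```python
import math

def sequence_log_likelihood(token_probs):
    """Average log-probability of the predicted tag sequence --
    the confidence score typically used by supervised open IE taggers."""
    return sum(math.log(p) for p in token_probs) / len(token_probs)

def binary_calibration_loss(confidence, is_correct):
    """Binary cross-entropy on the (squashed) confidence score.
    Correct extractions (is_correct=1) are pushed toward probability 1,
    incorrect ones toward 0, making scores comparable across sentences."""
    p = 1.0 / (1.0 + math.exp(-confidence))  # squash raw score into (0, 1)
    return -(is_correct * math.log(p) + (1 - is_correct) * math.log(1.0 - p))

def combined_loss(token_probs, is_correct, lam=1.0):
    """Standard negative log-likelihood plus the binary calibration term,
    weighted by a hypothetical trade-off coefficient `lam`."""
    conf = sequence_log_likelihood(token_probs)
    return -conf + lam * binary_calibration_loss(conf, is_correct)
```

In this sketch the calibration term rewards higher confidence on correct extractions and lower confidence on incorrect ones, which is the property the likelihood alone lacks when comparing assertions across different sentences.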

Wed Jan 27 2021
Artificial Intelligence
LSOIE: A Large-Scale Dataset for Supervised Open Information Extraction
Open Information Extraction (OIE) systems seek to compress the factual propositions of a sentence into a series of n-ary tuples. Current OIE datasets are limited in both size and diversity. Our LSOIE dataset is 20 times larger than the next largest OIE dataset.
Mon Apr 29 2019
Artificial Intelligence
Logician: A Unified End-to-End Neural Approach for Open-Domain Information Extraction
In this paper, we consider the problem of open information extraction. We focus on four types of valuable intermediate structures and propose a unified knowledge expression form, SAOKE, to express them. We train an end-to-end model, called Logician, using the sequence-to-sequence paradigm to transform sentences into facts.
Sun May 17 2020
NLP
IMoJIE: Iterative Memory-Based Joint Open Information Extraction
IMoJIE is an extension to CopyAttention, a sequence generation OpenIE model. It produces the next extraction conditioned on all previously extracted tuples. This approach overcomes both shortcomings of CopyAttention, resulting in a variable number of extractions per sentence.
Tue Aug 14 2018
NLP
KGCleaner : Identifying and Correcting Errors Produced by Information Extraction Systems
KGCleaner is a framework to identify and correct errors in data produced and delivered by an information extraction system. We introduce a multi-task model that jointly learns to predict whether an extracted relation is credible and to repair it if not.
Thu Apr 26 2018
NLP
Integrating Local Context and Global Cohesiveness for Open Information Extraction
ReMine is a novel Open IE system that integrates local context signals and global structural signals in a distant-supervision framework. It can be applied to many different domains, facilitating sentence-level tuple extraction using corpus-level statistics.
Wed May 25 2016
Neural Networks
Automatic Open Knowledge Acquisition via Long Short-Term Memory Networks with Feedback Negative Sampling
Previous studies in Open Information Extraction (Open IE) are mainly based on manually defining patterns or automatically learning them from a large corpus. We exploit long short-term memory (LSTM) networks to extract higher-level features along the shortest dependency paths.