Published on Wed Dec 16 2020

Multilingual Evidence Retrieval and Fact Verification to Combat Global Disinformation: The Power of Polyglotism

Denisa A. O. Roberts

This article investigates multilingual evidence retrieval and fact verification as a step to combat global disinformation. The goal is building multilingual systems that retrieve in evidence-rich languages.

Abstract

This article investigates multilingual evidence retrieval and fact verification as a step to combat global disinformation, a first effort of this kind, to the best of our knowledge. The goal is to build multilingual systems that retrieve evidence in evidence-rich languages to verify claims in evidence-poor languages, which are more commonly targeted by disinformation. To this end, our EnmBERT fact verification system shows evidence of transfer learning ability, and a 400-example mixed English-Romanian dataset is made available for cross-lingual transfer learning evaluation.

Thu Jun 17 2021
NLP
X-FACT: A New Benchmark Dataset for Multilingual Fact Checking
X-FACT is the largest publicly available multilingual dataset for factual verification of naturally existing real-world claims. The dataset contains short statements in 25 languages.
Tue Oct 13 2020
NLP
X-FACTR: Multilingual Factual Knowledge Retrieval from Pretrained Language Models
Language models (LMs) have proven surprisingly successful at capturing factual knowledge by completing cloze-style fill-in-the-blank questions. While knowledge is both written and queried in many languages, studies on LMs' factual representation ability have almost invariably been performed on English.
Tue Feb 23 2021
NLP
Factorization of Fact-Checks for Low Resource Indian Languages
Tue Nov 27 2018
NLP
The Fact Extraction and VERification (FEVER) Shared Task
We present the results of the first Fact Extraction and VERification (FEVER) Shared Task. The task challenged participants to classify whether human-written factoid claims could be Supported or Refuted. We received entries from 23 competing teams, 19 of which scored higher than the previously published baseline.
Thu Nov 05 2020
Artificial Intelligence
HoVer: A Dataset for Many-Hop Fact Extraction And Claim Verification
We introduce HoVer (HOppy VERification), a dataset for many-hop evidence extraction and fact verification. It challenges models to extract facts from Wikipedia articles that are relevant to a claim and classify whether the claim is Supported or Not-Supported by the facts.
Thu Sep 02 2021
NLP
MultiEURLEX -- A multi-lingual and multi-label legal document classification dataset for zero-shot cross-lingual transfer
MULTI-EURLEX is a new multilingual dataset for topic-based legal classification. The dataset comprises 65k European Union laws, officially translated into 23 languages. We use the dataset as a testbed for zero-shot cross-lingual transfer.