Published on Tue Jan 26 2021

A Comparison of Approaches to Document-level Machine Translation

Zhiyi Ma, Sergey Edunov, Michael Auli


Abstract

Document-level machine translation conditions on surrounding sentences to produce coherent translations. There has been much recent work in this area with the introduction of custom model architectures and decoding algorithms. This paper presents a systematic comparison of selected approaches from the literature on two benchmarks for which document-level phenomena evaluation suites exist. We find that a simple method based purely on back-translating monolingual document-level data performs as well as much more elaborate alternatives, both in terms of document-level metrics as well as human evaluation.

Wed Dec 18 2019
NLP
A Survey on Document-level Neural Machine Translation: Methods and Evaluation
Machine translation is an important task in natural language processing. With the resurgence of neural networks, translation quality has surpassed that of statistical techniques for most language pairs.
Mon Mar 22 2021
Artificial Intelligence
BlonD: An Automatic Evaluation Metric for Document-level Machine Translation
Thu Jul 22 2021
NLP
To Ship or Not to Ship: An Extensive Evaluation of Automatic Metrics for Machine Translation
Evaluation of automatic metrics has been limited to small collections of human judgements. We investigate which metrics have the highest accuracy for producing system-level quality rankings, and show that the sole use of BLEU negatively affected the past development of improved models.
Thu Aug 08 2019
NLP
A Test Suite and Manual Evaluation of Document-Level NMT at WMT19
We present a test suite for WMT19 aimed at assessing how machine translation systems handle discourse phenomena. We manually checked the outputs and identified the types of translation errors.
Sun Jun 06 2021
NLP
The FLORES-101 Evaluation Benchmark for Low-Resource and Multilingual Machine Translation
The FLORES-101 evaluation benchmark consists of 3001 sentences extracted from English Wikipedia. These sentences have been translated into 101 languages by professional translators. The dataset enables better assessment of model quality on the long tail of low-resource languages.
Tue Oct 04 2016
NLP
Is Neural Machine Translation Ready for Deployment? A Case Study on 30 Translation Directions
In this paper we provide the largest published comparison of translation quality for phrase-based SMT and neural machine translation across 30 directions. Experiments are performed for the recently published United Nations Parallel Corpus v1.0 and its large six-way sentence-aligned subcorpus.