Published on Tue Aug 28 2018

What do character-level models learn about morphology? The case of dependency parsing

Clara Vania, Andreas Grivas, Adam Lopez

When parsing morphologically-rich languages with neural models, it is beneficial to model input at the character level. It has been claimed that character-level models learn morphology. We test these claims by comparing character-level models to an oracle with access to explicit morphological analysis on twelve languages.

Abstract

When parsing morphologically-rich languages with neural models, it is beneficial to model input at the character level, and it has been claimed that this is because character-level models learn morphology. We test these claims by comparing character-level models to an oracle with access to explicit morphological analysis on twelve languages with varying morphological typologies. Our results highlight many strengths of character-level models, but also show that they are poor at disambiguating some words, particularly in the face of case syncretism. We then demonstrate that explicitly modeling morphological case improves our best model, showing that character-level models can benefit from targeted forms of explicit morphological modeling.
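The character-level models under comparison compose each word's vector from its characters rather than looking up a single word embedding. A minimal sketch of such an encoder is shown below, in PyTorch with illustrative hyperparameters; this is not the paper's exact architecture, just an instance of the general technique.

```python
import torch
import torch.nn as nn

class CharWordEncoder(nn.Module):
    """Compose a word representation from its characters with a BiLSTM.

    Illustrative sketch only: dimensions and vocabulary handling are
    assumptions, not the authors' exact configuration.
    """
    def __init__(self, n_chars, char_dim=50, word_dim=100):
        super().__init__()
        self.char_embed = nn.Embedding(n_chars, char_dim)
        self.bilstm = nn.LSTM(char_dim, word_dim // 2,
                              bidirectional=True, batch_first=True)

    def forward(self, char_ids):
        # char_ids: (batch, word_len) integer character indices
        embedded = self.char_embed(char_ids)        # (batch, word_len, char_dim)
        _, (h_n, _) = self.bilstm(embedded)         # h_n: (2, batch, word_dim // 2)
        # concatenate the final forward and backward hidden states
        return torch.cat([h_n[0], h_n[1]], dim=-1)  # (batch, word_dim)

# Usage: encode the word "cats" from raw character codes (ASCII here).
encoder = CharWordEncoder(n_chars=128)
word = torch.tensor([[ord(c) for c in "cats"]])
print(encoder(word).shape)  # torch.Size([1, 100])
```

Because the representation is built from characters, inflected forms that share a stem (e.g. "cats" and "cat") receive related vectors, which is the property commonly credited as "learning morphology".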

Fri Aug 31 2018
NLP
Indicatements that character language models learn English morpho-syntactic units and regularities
Character language models have access to surface morphological patterns, but it is not clear whether or how they learn abstract morphological regularities. We instrument a character language model with several probes and find that it can develop a specific unit to identify word boundaries.
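Probes of this kind typically fit a small diagnostic classifier on the language model's hidden states. A minimal sketch follows, assuming per-character hidden states and word-boundary labels have already been extracted; the random arrays are stand-ins for real data.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
hidden_states = rng.normal(size=(1000, 256))  # stand-in for real char-LM states
is_boundary = rng.integers(0, 2, size=1000)   # stand-in for real boundary labels

# If the probe separates the classes well, the states encode word boundaries.
probe = LogisticRegression(max_iter=1000).fit(hidden_states, is_boundary)
print(f"probe accuracy: {probe.score(hidden_states, is_boundary):.2f}")
```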
Wed Apr 29 2020
Machine Learning
Do Neural Language Models Show Preferences for Syntactic Formalisms?
This study investigates the extent to which the syntactic structure captured by language models adheres to a surface-syntactic or deep-syntactic style. We apply a probe for extracting directed dependency trees to BERT and ELMo models trained on 13 different languages.
Tue Dec 08 2020
NLP
The Role of Interpretable Patterns in Deep Learning for Morphology
Mon May 04 2020
NLP
From SPMRL to NMRL: What Did We Learn (and Unlearn) in a Decade of Parsing Morphologically-Rich Languages (MRLs)?
It has been exactly a decade since the establishment of SPMRL, a research initiative unifying multiple research efforts to address the peculiar challenges of Statistical Parsing for Morphologically-Rich Languages. Here we reflect on parsing MRLs in that decade and highlight the solutions and lessons learned.
Wed May 26 2021
NLP
Neural Morphology Dataset and Models for Multiple Languages, from the Large to the Endangered
We train neural models for morphological analysis, generation, and lemmatization for morphologically rich languages. We present a method for automatically extracting a substantial amount of training data from FSTs for 22 languages, 17 of which are endangered.
Tue Jun 14 2016
Neural Networks
Word Representation Models for Morphologically Rich Languages in Neural Machine Translation
Dealing with the complex word forms of morphologically rich languages is an open problem in language processing. In contrast to most modern neural machine translation systems, which discard the identity of rare words, we propose several architectures for learning word representations.