Published on Mon Jan 04 2021

Transformers and Transfer Learning for Improving Portuguese Semantic Role Labeling

Sofia Oliveira, Daniel Loureiro, Alípio Jorge
Abstract

Semantic Role Labeling (SRL) is a core Natural Language Processing task. For English, recent methods based on Transformer models have allowed for major improvements over the previous state of the art. However, for low-resource languages, and in particular for Portuguese, currently available SRL models are hindered by scarce training data. In this paper, we explore a model architecture composed only of a pre-trained BERT-based model, a linear layer, softmax and Viterbi decoding. We substantially improve the state-of-the-art performance in Portuguese by over 15 F1 points. Additionally, we improve SRL results on Portuguese corpora by exploiting cross-lingual transfer learning using multilingual pre-trained models (XLM-R), and transfer learning from dependency parsing in Portuguese. We evaluate the various proposed approaches empirically and, as a result, present a heuristic that supports the choice of the most appropriate model considering the available resources.
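The architecture the abstract describes is simple enough to outline end to end. Below is a minimal sketch, assuming PyTorch and the Hugging Face transformers library; the BERTimbau checkpoint name and the tiny BIO tag set are illustrative assumptions, not the authors' released code.

```python
# Minimal sketch (assumptions, not the paper's code): a pre-trained
# Portuguese BERT encoder, a linear layer over token states, softmax
# scores, and Viterbi decoding to forbid invalid BIO transitions.
import torch
from transformers import AutoModel, AutoTokenizer

# Illustrative tag set: O, B-ARG0, I-ARG0, B-ARG1, I-ARG1
NUM_LABELS = 5

tokenizer = AutoTokenizer.from_pretrained("neuralmind/bert-base-portuguese-cased")
encoder = AutoModel.from_pretrained("neuralmind/bert-base-portuguese-cased")
classifier = torch.nn.Linear(encoder.config.hidden_size, NUM_LABELS)

def viterbi_decode(emissions, transitions):
    """Highest-scoring label sequence given per-token log-probabilities
    (tokens x labels) and pairwise transition scores (labels x labels)."""
    score = emissions[0].clone()
    backpointers = []
    for t in range(1, emissions.size(0)):
        # total[i, j] = best score ending in label i at t-1, then moving to j
        total = score.unsqueeze(1) + transitions + emissions[t].unsqueeze(0)
        score, best_prev = total.max(dim=0)
        backpointers.append(best_prev)
    # follow backpointers from the best final label
    path = [int(score.argmax())]
    for best_prev in reversed(backpointers):
        path.append(int(best_prev[path[-1]]))
    return path[::-1]

inputs = tokenizer("O menino comeu a maçã.", return_tensors="pt")
with torch.no_grad():
    hidden = encoder(**inputs).last_hidden_state.squeeze(0)    # (tokens, dim)
    emissions = torch.log_softmax(classifier(hidden), dim=-1)  # (tokens, labels)

# Allowed moves score 0; invalid BIO moves score -inf.
transitions = torch.zeros(NUM_LABELS, NUM_LABELS)
transitions[0, 2] = transitions[0, 4] = float("-inf")  # O cannot precede I-*
transitions[1, 4] = transitions[3, 2] = float("-inf")  # B-ARGx cannot precede I-ARGy
print(viterbi_decode(emissions, transitions))
```

The Viterbi step is what distinguishes this from plain per-token classification: the softmax scores each token independently, and the transition matrix rules out label sequences (such as an I- tag with no preceding B- tag) that an unconstrained argmax could produce.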

Sun Sep 01 2019
NLP
Syntax-aware Multilingual Semantic Role Labeling
Semantic role labeling (SRL) has earned a series of successes with ever higher performance improvements. Most of these efforts focus on English, while SRL for languages other than English has received relatively little attention. This paper intends to fill that gap for multilingual SRL.
Thu Aug 29 2019
Machine Learning
Translate and Label! An Encoder-Decoder Approach for Cross-lingual Semantic Role Labeling
We propose a Cross-lingual Encoder-Decoder model that simultaneously translates and generates sentences with Semantic Role Labeling annotations. Unlike annotation projection techniques, our model does not need parallel data during inference time. Our approach can be applied in monolingual, multilingual and cross-lingual settings.
Mon Oct 05 2020
NLP
X-SRL: A Parallel Cross-Lingual Semantic Role Labeling Dataset
Existing multilingual SRL datasets contain disparate annotation styles or domains, hampering generalization in multilingual learning. We propose a method to automatically construct an SRL corpus that is parallel in four languages: English, French, German, Spanish.
Tue Jan 10 2017
Artificial Intelligence
A Simple and Accurate Syntax-Agnostic Neural Model for Dependency-based Semantic Role Labeling
We introduce a simple and accurate neural model for dependency-based semantic role labeling. Our model predicts predicate-argument dependencies relying on a bidirectional LSTM encoder. The semantic role labeler achieves competitive performance on English, even without any kind of syntactic information.
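The kind of model this snippet describes can be sketched in a few lines. The following is a hypothetical illustration (the dimensions, names, and pair-scoring detail are assumptions, not the paper's exact architecture): a BiLSTM encodes the sentence, and each token is scored as a candidate argument of a given predicate.

```python
# Hypothetical sketch of a syntax-agnostic, dependency-based SRL scorer:
# a BiLSTM encoder plus a classifier over (predicate, token) state pairs.
import torch
import torch.nn as nn

class BiLSTMRoleLabeler(nn.Module):
    def __init__(self, vocab_size, emb_dim=100, hidden_dim=200, n_roles=20):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.bilstm = nn.LSTM(emb_dim, hidden_dim, batch_first=True,
                              bidirectional=True)
        # scores a role for each (predicate, candidate-argument) pair
        self.scorer = nn.Linear(4 * hidden_dim, n_roles)

    def forward(self, token_ids, predicate_index):
        states, _ = self.bilstm(self.embed(token_ids))  # (1, n, 2*hidden)
        pred_state = states[:, predicate_index]         # (1, 2*hidden)
        n = states.size(1)
        # pair every token's state with the predicate's state
        pairs = torch.cat([states,
                           pred_state.unsqueeze(1).expand(-1, n, -1)], dim=-1)
        return self.scorer(pairs)                       # role logits per token

model = BiLSTMRoleLabeler(vocab_size=1000)
logits = model(torch.randint(0, 1000, (1, 7)), predicate_index=2)
print(logits.shape)  # torch.Size([1, 7, 20])
```

Note the contrast with the Transformer-based architecture above: here the contextual encoder is a recurrent network trained from scratch, and no pre-trained language model or syntactic features are involved.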
Sat Jan 09 2021
NLP
Trankit: A Light-Weight Transformer-based Toolkit for Multilingual Natural Language Processing
Trankit is a light-weight Transformer-based toolkit for multilingual NLP. Built on a state-of-the-art pretrained language model, Trankit significantly outperforms prior multilingual NLP pipelines.
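A short usage sketch based on Trankit's documented Pipeline interface; the 'portuguese' language identifier is an assumption to be checked against the toolkit's supported-language list.

```python
# Hypothetical usage sketch; the language identifier is an assumption.
from trankit import Pipeline

p = Pipeline('portuguese')         # downloads pretrained weights on first use
doc = p('O menino comeu a maçã.')  # runs the full pipeline (tokenize, tag, parse)
print(doc)
```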
Sun Dec 15 2019
NLP
Multilingual is not enough: BERT for Finnish
Deep learning-based language models have been demonstrated to allow efficient transfer learning for natural language processing. Recent efforts have introduced multilingual models that can be fine-tuned to address tasks in a large number of different languages. However, we still lack a thorough understanding of the capabilities of these models.