Published on Mon Oct 05 2020

Modulated Fusion using Transformer for Linguistic-Acoustic Emotion Recognition

Jean-Benoit Delbrouck, Noé Tits, Stéphane Dupont

Abstract

This paper presents a new lightweight yet powerful solution for the tasks of Emotion Recognition and Sentiment Analysis. Our motivation is to propose two architectures based on Transformers and modulation that combine the linguistic and acoustic inputs from a wide range of datasets to challenge, and sometimes surpass, the state-of-the-art in the field. To demonstrate the efficiency of our models, we carefully evaluate their performance on the IEMOCAP, MOSI, MOSEI and MELD datasets. The experiments can be directly replicated and the code is fully open for future research.
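
The abstract describes modulation-based fusion but not its mechanics. Below is a minimal sketch, assuming a FiLM-style scheme in which an utterance-level acoustic vector predicts a scale and shift applied to the linguistic Transformer states; class and variable names are illustrative and not the authors' code.

```python
import torch
import torch.nn as nn

class ModulatedFusion(nn.Module):
    """Sketch of modulated fusion: acoustic features generate FiLM-style
    scale/shift parameters that condition the linguistic Transformer states."""

    def __init__(self, d_model=256, n_heads=4, n_layers=2, n_classes=4):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.text_encoder = nn.TransformerEncoder(layer, n_layers)
        # Acoustic branch predicts one (gamma, beta) pair per feature dimension.
        self.film = nn.Linear(d_model, 2 * d_model)
        self.classifier = nn.Linear(d_model, n_classes)

    def forward(self, text, audio):
        # text:  (batch, seq_len, d_model) token embeddings
        # audio: (batch, d_model) utterance-level acoustic features
        h = self.text_encoder(text)                     # contextual text states
        gamma, beta = self.film(audio).chunk(2, dim=-1)
        h = gamma.unsqueeze(1) * h + beta.unsqueeze(1)  # modulate each time step
        return self.classifier(h.mean(dim=1))           # pool and classify


# Toy usage with random tensors standing in for real features.
model = ModulatedFusion()
logits = model(torch.randn(8, 20, 256), torch.randn(8, 256))
print(logits.shape)  # torch.Size([8, 4])
```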

Mon Jun 29 2020
Machine Learning
A Transformer-based joint-encoding for Emotion Recognition and Sentiment Analysis
The proposed solution has also been submitted to the ACL20: Second Grand-Challenge on Multimodal Language. The code to replicate the presented experiments is open-source.
Tue Aug 31 2021
Machine Learning
Improving Multimodal fusion via Mutual Dependency Maximisation
Multimodal sentiment analysis is a trending area of research. We propose a set of new penalties that measure the dependency between modalities. We demonstrate that our new penalties lead to a consistent improvement in accuracy.
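
The teaser names dependency penalties without defining them. As a rough illustration only (not the paper's exact estimator), an InfoNCE-style contrastive loss is one common way to maximize a lower bound on the dependency between two modality embeddings:

```python
import torch
import torch.nn.functional as F

def dependency_penalty(z_text, z_audio, temperature=0.1):
    """InfoNCE-style dependency penalty between two modality embeddings:
    matched pairs (same sample) should score higher than mismatched pairs
    within the batch. Minimizing this loss maximizes the dependency bound."""
    z_text = F.normalize(z_text, dim=-1)
    z_audio = F.normalize(z_audio, dim=-1)
    logits = z_text @ z_audio.t() / temperature          # (batch, batch) similarities
    targets = torch.arange(z_text.size(0), device=z_text.device)
    return F.cross_entropy(logits, targets)              # i-th text pairs with i-th audio
```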
Mon Sep 07 2020
NLP
TransModality: An End2End Fusion Method with Transformer for Multimodal Sentiment Analysis
Multimodal sentiment analysis is an important research area. A variety of fusion methods have been proposed, but few adopt end-to-end translation models to mine the subtle correlation between modalities. We propose a new fusion method, TransModality, to address the task.
Sat Jun 16 2018
Computer Vision
Multimodal Sentiment Analysis using Hierarchical Fusion with Context Modeling
Multimodal sentiment analysis is a very actively growing field of research. A promising area of opportunity in this field is to improve the multimodal fusion mechanism. We present a feature fusion strategy that proceeds hierarchically.
Sat Jul 06 2019
Computer Vision
Multimodal Fusion with Deep Neural Networks for Audio-Video Emotion Recognition
This paper presents a novel deep neural network (DNN) for multimodal fusion of audio, video and text modalities for emotion recognition. The proposed DNN architecture has independent and shared layers which aim to learn the combined representation that yields the best prediction.
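
As a rough illustration of this independent-plus-shared layout (feature dimensions are illustrative, not the paper's configuration): each modality passes through its own layers before a shared trunk learns the combined representation.

```python
import torch
import torch.nn as nn

class SharedFusionDNN(nn.Module):
    """Sketch of independent per-modality layers followed by shared layers."""

    def __init__(self, dims=(128, 74, 300), hidden=64, n_classes=4):
        super().__init__()
        # Independent (modality-specific) layers: audio, video, text.
        self.branches = nn.ModuleList(
            nn.Sequential(nn.Linear(d, hidden), nn.ReLU()) for d in dims
        )
        # Shared layers operate on the concatenated branch outputs.
        self.shared = nn.Sequential(
            nn.Linear(hidden * len(dims), hidden), nn.ReLU(),
            nn.Linear(hidden, n_classes),
        )

    def forward(self, audio, video, text):
        feats = [b(x) for b, x in zip(self.branches, (audio, video, text))]
        return self.shared(torch.cat(feats, dim=-1))
```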
Wed Apr 17 2019
NLP
Complementary Fusion of Multi-Features and Multi-Modalities in Sentiment Analysis
Sentiment analysis, mostly based on text, has been rapidly developing in the last decade. We propose a novel fusion strategy including both multi-feature fusion and multi-modality fusion to improve the accuracy of audio-text sentiment analysis.