Published on Wed Nov 13 2019

The phonetic bases of vocal expressed emotion: natural versus acted

Hira Dhamyal, Shahan Ali Memon, Bhiksha Raj, Rita Singh

Can vocal emotions be emulated? This question has been a recurrent concern of the speech community. We use a self-attention based emotion classification model to understand the phonetic bases of emotions. Our tests show significant differences in the manner and choice of phonemes in acted and natural speech.

Abstract

Can vocal emotions be emulated? This question has been a recurrent concern of the speech community, and has also been vigorously investigated. It has been fueled further by its link to the issue of validity of acted emotion databases. Much of the speech and vocal emotion research has relied on acted emotion databases as valid proxies for studying natural emotions. To create models that generalize to natural settings, it is crucial to work with valid prototypes -- ones that can be assumed to reliably represent natural emotions. More concretely, it is important to study emulated emotions against natural emotions in terms of their physiological and psychological concomitants. In this paper, we present an at-scale systematic study of the differences between natural and acted vocal emotions. We use a self-attention based emotion classification model to understand the phonetic bases of emotions by discovering the most 'attended' phonemes for each class of emotions. We then compare these attended phonemes in their importance and distribution across the acted and natural classes. Our tests show significant differences in the manner and choice of phonemes in acted and natural speech, suggesting that acted speech databases have moderate to low validity and value for emotion classification tasks.
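To make the attention-based analysis concrete, the sketch below shows one way a self-attention based emotion classifier can expose per-frame attention weights for later aggregation over phonemes. It is a minimal illustration rather than the authors' implementation; the BiLSTM encoder, layer sizes, four-class emotion set, and reliance on an external frame-to-phoneme alignment are all assumptions.

```python
import torch
import torch.nn as nn

class AttentivePhonemeClassifier(nn.Module):
    """Sequence classifier with self-attention pooling whose weights
    can be inspected per frame (and, via an alignment, per phoneme)."""

    def __init__(self, feat_dim=40, hidden=128, n_emotions=4):
        super().__init__()
        # Sequence encoder; a BiLSTM is an assumption, any encoder works.
        self.encoder = nn.LSTM(feat_dim, hidden, batch_first=True,
                               bidirectional=True)
        self.score = nn.Linear(2 * hidden, 1)   # scalar score per frame
        self.out = nn.Linear(2 * hidden, n_emotions)

    def forward(self, feats):
        # feats: (batch, time, feat_dim) acoustic feature frames
        h, _ = self.encoder(feats)                   # (B, T, 2H)
        alpha = torch.softmax(self.score(h), dim=1)  # (B, T, 1) attention
        pooled = (alpha * h).sum(dim=1)              # attention-weighted pool
        return self.out(pooled), alpha.squeeze(-1)

model = AttentivePhonemeClassifier()
feats = torch.randn(2, 300, 40)      # two utterances, 300 frames each
logits, alpha = model(feats)         # alpha: (2, 300) frame weights
```

Given a forced alignment of frames to phonemes, the weights in alpha can be summed within each phoneme's span; ranking phonemes by this aggregated attention, separately for acted and natural utterances, yields the kind of per-class 'attended phoneme' distributions the abstract compares.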

Related papers

Mon May 31 2021
NLP
Emotional Voice Conversion: Theory, Databases and ESD
The ESD database consists of 350 parallel utterances spoken by 10 native English and 10 native Chinese speakers. More than 29 hours of speech data were recorded in a controlled acoustic environment. The database is suitable for multi-speaker and cross-lingual studies.
Mon Aug 02 2021
NLP
The Role of Phonetic Units in Speech Emotion Recognition
We propose a method for emotion recognition through emotion-dependent speech recognition using Wav2vec 2.0. Models of phonemes, broad phonetic classes, and syllables all significantly outperform the utterance model.
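As a rough illustration of the contrast this summary draws, the sketch below pools Wav2vec 2.0 frame features at the segment level versus the utterance level; the paper's emotion-dependent ASR setup itself is not reproduced. The facebook/wav2vec2-base checkpoint and the frame-index alignment are assumptions.

```python
import torch
from transformers import Wav2Vec2FeatureExtractor, Wav2Vec2Model

extractor = Wav2Vec2FeatureExtractor.from_pretrained("facebook/wav2vec2-base")
encoder = Wav2Vec2Model.from_pretrained("facebook/wav2vec2-base")

waveform = torch.randn(16000)  # 1 s of placeholder 16 kHz audio
inputs = extractor(waveform.numpy(), sampling_rate=16000,
                   return_tensors="pt")
with torch.no_grad():
    # Frame-level features, roughly one vector per 20 ms: (T, 768)
    frames = encoder(inputs.input_values).last_hidden_state[0]

# Utterance-level representation: mean over all frames.
utt_vec = frames.mean(dim=0)

# Phoneme-level representations: mean within each aligned segment.
# The frame-index segments below are a hypothetical alignment.
segments = [(0, 10), (10, 25), (25, 49)]
phone_vecs = torch.stack([frames[a:b].mean(dim=0) for a, b in segments])
```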
Wed Sep 30 2020
Machine Learning
Embedded Emotions -- A Data Driven Approach to Learn Transferable Feature Representations from Raw Speech Input for Emotion Recognition
Fri Jan 31 2020
Machine Learning
Detecting Emotion Primitives from Speech and their use in discerning Categorical Emotions
Emotion plays an essential role in human-to-human communication, enabling us to convey feelings such as happiness, frustration, and sincerity. This work investigated a long short-term memory (LSTM) network to detect primitive emotion attributes such as valence, arousal, and dominance.
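A minimal, hypothetical sketch of the kind of model this summary describes: an LSTM that regresses the three emotion primitives from a frame-level feature sequence. The feature dimension and layer sizes are illustrative, not taken from the paper.

```python
import torch
import torch.nn as nn

class PrimitiveRegressor(nn.Module):
    """LSTM that regresses valence, arousal, and dominance from a
    frame-level feature sequence (sizes are illustrative)."""

    def __init__(self, feat_dim=40, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(feat_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 3)  # valence, arousal, dominance

    def forward(self, feats):
        _, (h_n, _) = self.lstm(feats)  # final hidden state summarizes the utterance
        return self.head(h_n[-1])       # (batch, 3) primitive estimates

model = PrimitiveRegressor()
vad = model(torch.randn(8, 200, 40))   # 8 utterances -> (8, 3)
```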
Fri Nov 29 2019
Machine Learning
Bimodal Speech Emotion Recognition Using Pre-Trained Language Models
Wed Sep 01 2010
Artificial Intelligence
Emotional State Categorization from Speech: Machine vs. Human
This paper presents our investigation of emotional state categorization from speech signals, comparing a psychologically inspired computational model against human performance. We propose a multistage categorization strategy that allows an automatic categorization model to be established flexibly for a given emotional speech categorization task.