Published on Mon Jan 08 2018

Multilayered Model of Speech

Andrey Chistyakov

Abstract

Human speech is the most important component of General Artificial Intelligence and the subject of much research. The hypothesis proposed in this article offers an explanation for the difficulties that modern science faces in simulating the human brain. It rests on the author's conviction that each person's brain differs in its ability to process and store information. Therefore, the approaches currently used to create General Artificial Intelligence have to be altered.

Related Papers

Tue Nov 06 2018
Machine Learning
Reconstructing Speech Stimuli From Human Auditory Cortex Activity Using a WaveNet Approach
The superior temporal gyrus (STG) region of cortex contributes critically to speech recognition. We show that the proposed WaveNet model, despite limited available data, can reconstruct speech stimuli from STG intracranial recordings.
Thu Jul 01 2021
NLP
What do End-to-End Speech Models Learn about Speaker, Language and Channel Information? A Layer-wise and Neuron-level Analysis
DNNs are innately opaque and difficult to interpret. We no longer know what features are learned, where they are preserved, and how they interact. Such an analysis is important for better model understanding and for ensuring fairness in ethical decision making.
Mon Jan 11 2016
Neural Networks
Investigating gated recurrent neural networks for speech synthesis
The long short-term memory (LSTM) architecture is attractive because it addresses the vanishing gradient problem of standard RNNs. LSTMs can achieve significantly better performance on statistical parametric speech synthesis (SPSS) than deep feed-forward neural networks.
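As a rough illustration of the gating mechanism this abstract refers to, here is a minimal NumPy sketch of a single LSTM step. The dimensions and random weights are toy values chosen for the example, not taken from the paper; the additive update of the cell state `c` is what lets gradients flow with less attenuation than in a standard RNN.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM step. W: (4H, D), U: (4H, H), b: (4H,)."""
    z = W @ x + U @ h_prev + b
    i, f, o, g = np.split(z, 4)          # input, forget, output gates + candidate
    i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)
    c = f * c_prev + i * np.tanh(g)      # forget old state, add gated candidate
    h = o * np.tanh(c)                   # gated output
    return h, c

# toy dimensions: input size D, hidden size H
D, H = 3, 4
rng = np.random.default_rng(0)
W = rng.normal(size=(4 * H, D))
U = rng.normal(size=(4 * H, H))
b = np.zeros(4 * H)

h, c = np.zeros(H), np.zeros(H)
h, c = lstm_step(rng.normal(size=D), h, c, W, U, b)
```

Because the forget gate multiplies `c_prev` elementwise instead of pushing it through a squashing nonlinearity, the cell state acts as a near-linear memory path through time.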
Tue Apr 20 2021
NLP
Review of end-to-end speech synthesis technology based on deep learning
The current research focus is deep learning-based end-to-end speech synthesis. It mainly consists of three modules: a text front-end, an acoustic model, and a vocoder. This paper reviews the state of research on these three parts and classifies and compares the various methods.
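The three-module decomposition described above can be sketched as a chain of plain functions. The bodies below are hypothetical stand-ins (random outputs) that only illustrate the data flow between the stages; real systems replace each function with a trained neural network.

```python
import numpy as np

def text_frontend(text):
    """Text front-end: normalize text and map characters to integer IDs."""
    vocab = {ch: i for i, ch in enumerate("abcdefghijklmnopqrstuvwxyz ")}
    return [vocab[ch] for ch in text.lower() if ch in vocab]

def acoustic_model(token_ids, n_mels=80):
    """Acoustic model: map token IDs to a mel-spectrogram-like matrix.
    Stand-in: one random frame per token (a real model is a neural net)."""
    rng = np.random.default_rng(0)
    return rng.normal(size=(len(token_ids), n_mels))

def vocoder(mel, hop=256):
    """Vocoder: turn the spectrogram into a waveform.
    Stand-in: noise of the right length (a real vocoder is e.g. WaveNet)."""
    rng = np.random.default_rng(1)
    return rng.normal(size=mel.shape[0] * hop)

# end-to-end data flow: text -> token IDs -> spectrogram -> waveform
wave = vocoder(acoustic_model(text_frontend("hello world")))
```

The value of this factoring is that each interface (token IDs, spectrogram frames, samples) can be standardized, so front-ends, acoustic models, and vocoders can be developed and swapped independently.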
Mon Apr 19 2021
Artificial Intelligence
Interpreting intermediate convolutional layers of CNNs trained on raw speech
Mon Oct 14 2019
Machine Learning
The Theory behind Controllable Expressive Speech Synthesis: a Cross-disciplinary Approach
Expressive speech synthesis is part of the Human-Computer Interaction field. It requires knowledge in areas such as machine learning, signal processing, sociology, and psychology. We present a history of the main methods of Text-to-Speech synthesis.