Published on Mon Jun 25 2018

Single-channel Speech Dereverberation via Generative Adversarial Training

Chenxing Li, Tieqiang Wang, Shuang Xu, Bo Xu

We propose a single-channel speech dereverberation system based on convolutional, bidirectional long short-term memory and deep feed-forward neural networks. The proposed model outperforms weighted prediction error and deep neural network-based systems.

Abstract

In this paper, we propose a single-channel speech dereverberation system (DeReGAT) based on a convolutional, bidirectional long short-term memory, and deep feed-forward neural network (CBLDNN) with generative adversarial training (GAT). To obtain better speech quality than minimizing the mean square error (MSE) alone provides, GAT is employed to make the dereverberated speech indistinguishable from the clean samples. In addition, our system can handle a wide range of reverberation and adapts well to varying environments. The experimental results show that the proposed model outperforms weighted prediction error (WPE) and deep neural network-based systems. DeReGAT is also extended to an online speech dereverberation scenario, where it achieves performance comparable to the offline case.
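The idea of training with an adversarial objective in addition to MSE can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the exact loss, so the non-saturating adversarial term, the `adv_weight` value, and all function names below are assumptions chosen for illustration. Inputs are plain lists of feature values and discriminator probabilities.

```python
import math

def mse_loss(enhanced, clean):
    """Mean square error between dereverberated and clean features."""
    return sum((e - c) ** 2 for e, c in zip(enhanced, clean)) / len(clean)

def adversarial_loss(d_scores_on_enhanced, eps=1e-7):
    """Generator-side adversarial term, -log D(G(x)) (assumed form).

    d_scores_on_enhanced: discriminator probabilities that each
    dereverberated sample is clean speech.
    """
    clipped = [min(max(p, eps), 1.0 - eps) for p in d_scores_on_enhanced]
    return -sum(math.log(p) for p in clipped) / len(clipped)

def generator_loss(enhanced, clean, d_scores_on_enhanced, adv_weight=0.01):
    """MSE plus a weighted adversarial term; adv_weight is hypothetical."""
    return (mse_loss(enhanced, clean)
            + adv_weight * adversarial_loss(d_scores_on_enhanced))

def discriminator_loss(d_scores_on_clean, d_scores_on_enhanced, eps=1e-7):
    """Binary cross-entropy: clean samples labeled 1, dereverberated 0."""
    real = [min(max(p, eps), 1.0 - eps) for p in d_scores_on_clean]
    fake = [min(max(p, eps), 1.0 - eps) for p in d_scores_on_enhanced]
    real_term = -sum(math.log(p) for p in real) / len(real)
    fake_term = -sum(math.log(1.0 - p) for p in fake) / len(fake)
    return real_term + fake_term
```

In this sketch the generator is rewarded both for matching the clean features and for producing outputs the discriminator scores as clean, which is the sense in which the dereverberated speech is pushed to be indistinguishable from clean samples.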

Tue Mar 27 2018
NLP
Investigating Generative Adversarial Networks based Speech Dereverberation for Robust Speech Recognition
We investigate the use of generative adversarial networks (GANs) in speech recognition. LSTM leads to a significant improvement compared with feed-forward DNN and CNN on our dataset.
Wed Jun 10 2020
Machine Learning
HiFi-GAN: High-Fidelity Denoising and Dereverberation Based on Speech Deep Features in Adversarial Networks
Real-world audio recordings are often degraded by factors such as noise and equalization distortion. HiFi-GAN is a deep learning method to transform recorded speech to sound as though it had been recorded in a studio.
Sat Apr 06 2019
Machine Learning
Towards Generalized Speech Enhancement with Generative Adversarial Networks
The speech enhancement task usually consists of removing additive noise or reverberation that partially mask spoken utterances. However, little attention is drawn to other, perhaps more aggressive signal distortions like clipping, chunk elimination, or frequency-band removal. Such distortions can have a large impact not only
Tue Mar 28 2017
Neural Networks
SEGAN: Speech Enhancement Generative Adversarial Network
Current speech enhancement techniques operate on the spectral domain and/or exploit some higher-level feature. In contrast, we operate at the waveform level, training the model end-to-end. We incorporate 28 speakers and 40 different noise conditions into the same model.
Tue Mar 30 2021
Machine Learning
Time-domain Speech Enhancement with Generative Adversarial Learning
Mon Oct 21 2019
Neural Networks
AeGAN: Time-Frequency Speech Denoising via Generative Adversarial Networks
Automatic speech recognition (ASR) systems are of vital importance nowadays. Speech enhancement is a valuable building block in ASR systems and other applications such as hearing aids, smartphones and teleconferencing systems.