Published on Sun Jun 07 2020

Maximum Phase Modeling for Sparse Linear Prediction of Speech

Thomas Drugman

Linear prediction (LP) is a ubiquitous analysis method in speech processing. Sparse LP has been shown to be effective for several problems in speech modeling and coding. The proposed method is shown to significantly increase the sparsity of the LP residual signal.

Abstract

Linear prediction (LP) is a ubiquitous analysis method in speech processing. Various studies have focused on sparse LP algorithms, which introduce sparsity constraints into the LP framework. Sparse LP has been shown to be effective for several problems in speech modeling and coding. However, all existing approaches assume the speech signal to be minimum-phase. Because speech is known to be mixed-phase, the resulting residual signal contains a persistent maximum-phase component. The aim of this paper is to propose a novel technique which incorporates a model of the maximum-phase contribution of speech and can be applied to any filter representation. The proposed method is shown to significantly increase the sparsity of the LP residual signal and to be effective in two illustrative applications: speech polarity detection and excitation modeling.
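To make the terms "LP residual" and "sparsity" concrete, the following is a minimal sketch of conventional (minimum-phase) linear prediction using the autocorrelation method. It is not the maximum-phase technique proposed in the paper. The function names, the toy signal, and the use of the l1/l2 norm ratio as a sparsity proxy are all assumptions made for illustration.

```python
import numpy as np

def lp_coefficients(x, order):
    """Autocorrelation-method LP: a[k] such that x[n] ~ sum_k a[k] * x[n-k]."""
    x = np.asarray(x, dtype=float)
    # Autocorrelation at lags 0..order (zero lag sits at index len(x) - 1).
    r = np.correlate(x, x, mode="full")[len(x) - 1 : len(x) + order]
    # Toeplitz normal equations R a = r[1..order].
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    return np.linalg.solve(R, r[1 : order + 1])

def lp_residual(x, a):
    """Inverse-filter x: e[n] = x[n] - sum_k a[k] * x[n-k]."""
    x = np.asarray(x, dtype=float)
    inverse_filter = np.concatenate(([1.0], -np.asarray(a, dtype=float)))
    return np.convolve(x, inverse_filter)[: len(x)]

def l1_l2_ratio(e):
    """Illustrative sparsity proxy (an assumption): lower means sparser."""
    e = np.asarray(e, dtype=float)
    return np.sum(np.abs(e)) / np.sqrt(np.sum(e ** 2))

# Toy signal (hypothetical): an impulse train through a stable all-pole
# filter, i.e. a purely minimum-phase system.
true_a = np.array([1.3, -0.6])
excitation = np.zeros(800)
excitation[::80] = 1.0
x = np.zeros_like(excitation)
for n in range(len(x)):
    x[n] = excitation[n]
    for k, ak in enumerate(true_a, start=1):
        if n - k >= 0:
            x[n] += ak * x[n - k]

a_hat = lp_coefficients(x, order=2)
e = lp_residual(x, a_hat)
print(a_hat, l1_l2_ratio(x), l1_l2_ratio(e))
```

Because this toy signal is purely minimum-phase, the causal inverse filter recovers an impulse-like residual. For mixed-phase speech, as the abstract notes, a maximum-phase component persists in the residual of such a model, which is the gap the proposed method targets.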

Sun May 31 2020
NLP
Maximum Voiced Frequency Estimation: Exploiting Amplitude and Phase Spectra
Thu Jan 31 2019
Machine Learning
End-to-End Probabilistic Inference for Nonstationary Audio Analysis
A typical audio signal processing pipeline includes multiple disjoint analysis stages. We show how time-frequency analysis and nonnegative matrix factorisation can be jointly formulated as a Gaussian process model. By doing so, we are able to process audio signals with hundreds of thousands of data points.
Sun May 31 2020
NLP
Residual Excitation Skewness for Automatic Speech Polarity Detection
Wed Apr 25 2018
Machine Learning
Estimation with Low-Rank Time-Frequency Synthesis Models
Many state-of-the-art signal decomposition techniques rely on a low-rank factorization of a time-frequency (t-f) transform. In this paper we propose a synthesis approach, where t-f low-rankness is imposed on the synthesis coefficients of the data
Fri Sep 15 2017
Machine Learning
Supervised and Unsupervised Speech Enhancement Using Nonnegative Matrix Factorization
Reducing the interference noise in a monaural noisy speech signal has been a challenging task for many years. We propose a novel speech enhancement method that is based on a Bayesian formulation of NMF (BNMF).
Wed Sep 02 2015
NLP
Enhancement and Recognition of Reverberant and Noisy Speech by Extending Its Coherence
Most speech enhancement algorithms make use of the short-time Fourier transform (STFT). The duration of short STFT frames is inherently limited by the nonstationarity of speech signals. Our approach centers around using a single-channel minimum mean-square error log-spectral amplitude (MMSE-LSA)