Published on Mon Sep 16 2019

Fast transcription of speech in low-resource languages

Mark Hasegawa-Johnson, Camille Goudeseune, Gina-Anne Levow

Software transcribes forty hours of speech in a surprise language, using only a few tens of megabytes of noisy text in that language. It has worked successfully with corpora in Arabic, Assamese, Kinyarwanda, Russian, Sinhalese, Swahili, Tagalog, and Tamil.

Abstract

We present software that, in only a few hours, transcribes forty hours of recorded speech in a surprise language, using only a few tens of megabytes of noisy text in that language and a zero-resource grapheme-to-phoneme (G2P) table. A pretrained acoustic model maps acoustic features to phonemes; a reversed G2P maps these phonemes to candidate graphemes; then a language model selects the most likely grapheme sequence, i.e., a transcription. This software has worked successfully with corpora in Arabic, Assamese, Kinyarwanda, Russian, Sinhalese, Swahili, Tagalog, and Tamil.
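
The last two stages of the pipeline described in the abstract (reversed G2P, then language-model selection) can be illustrated with a toy sketch. This is not the authors' implementation. The acoustic model is left out and its phoneme output is assumed to be given, the G2P table and the training text are invented for illustration, and a smoothed character-bigram model stands in for the language model. All function names are hypothetical.

```python
import math
from collections import defaultdict


def invert_g2p(g2p):
    """Reverse a grapheme->phoneme table into phoneme->[candidate graphemes]."""
    p2g = defaultdict(list)
    for grapheme, phoneme in g2p.items():
        p2g[phoneme].append(grapheme)
    return dict(p2g)


def train_bigram_lm(text, alpha=1.0):
    """Return a function giving add-alpha smoothed character-bigram log-probabilities.

    Assumption: a character bigram model is only a stand-in for the
    language model mentioned in the abstract.
    """
    vocab_size = len(set(text))
    counts = defaultdict(lambda: defaultdict(int))
    for prev, cur in zip(text, text[1:]):
        counts[prev][cur] += 1

    def logprob(prev, cur):
        total = sum(counts[prev].values())
        return math.log((counts[prev][cur] + alpha) / (total + alpha * vocab_size))

    return logprob


def decode(phonemes, p2g, logprob, start=" "):
    """Viterbi search for the most likely grapheme string given a phoneme sequence."""
    # best[g] = (log score, grapheme string) of the best path ending in g
    best = {start: (0.0, "")}
    for ph in phonemes:
        nxt = {}
        for g in p2g[ph]:
            score, path = max(
                (s + logprob(prev, g), p) for prev, (s, p) in best.items()
            )
            nxt[g] = (score, path + g)
        best = nxt
    return max(best.values())[1]


# Hypothetical toy data: two graphemes ("c" and "k") share the phoneme "k".
g2p = {"c": "k", "k": "k", "a": "a", "t": "t"}
noisy_text = "the cat sat on a cat mat"

p2g = invert_g2p(g2p)
lm = train_bigram_lm(noisy_text)
transcript = decode(["k", "a", "t"], p2g, lm)
print(transcript)
```

In this toy setup the phoneme "k" is ambiguous between two spellings, and the text statistics are what resolve it. This mirrors, in a much reduced form, the role the abstract gives to the language model.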

Tue May 03 2016
Neural Networks
TheanoLM - An Extensible Toolkit for Neural Network Language Modeling
TheanoLM is a tool for training neural network language models (NNLMs), generating text, and scoring sentences. It is written in Python using the Theano library, which makes it easy to extend and to tune any aspect of the training process.
Thu Aug 13 2020
Machine Learning
LSTM Acoustic Models Learn to Align and Pronounce with Graphemes
Standard phoneme based systems require handcrafted lexicons that are difficult and expensive to obtain. We propose a training methodology for a grapheme-based speech recognizer that can be trained in a purely data-driven fashion.
Sat May 16 2020
NLP
That Sounds Familiar: an Analysis of Phonetic Representations Transfer Across Languages
Only a handful of the world's languages are abundant with the resources that enable practical applications of speech processing technologies. One of the methods to overcome this problem is to use the resources existing in other languages to train a multilingual automatic speech recognition model.
Fri Jul 27 2018
NLP
A small Griko-Italian speech translation corpus
The corpus consists of 330 utterances (about 20 minutes of speech) that have been transcribed and translated into Italian. The dataset is available online to encourage replicability and diversity in language documentation experiments.
Thu Jun 23 2016
Machine Learning
NN-grams: Unifying neural network and n-gram language models for Speech Recognition
We present NN-grams, a novel, hybrid language model for speech recognition. It combines the memorization capacity and scalability of an n-gram model with the generalization ability of neural networks. We report experiments where the model is trained on 26B words.
Mon Nov 23 2020
NLP
The Zero Resource Speech Benchmark 2021: Metrics and baselines for unsupervised spoken language modeling
We introduce a new unsupervised task, spoken language modeling: the learning of linguistic representations from raw audio signals without any labels. We present the results and analyses of a composite baseline made of the concatenation of three unsupervised systems.