Published on Tue Jul 02 2019

Kite: Automatic speech recognition for unmanned aerial vehicles

Dan Oneata, Horia Cucu

This paper addresses the problem of building a speech recognition system for unmanned aerial vehicles (UAVs), for which voice interfaces remain largely unaddressed. We find that recurrent neural networks handle both research directions studied: coping with an incomplete list of commands at train time and incorporating visual information in the language model.

Abstract

This paper addresses the problem of building a speech recognition system attuned to the control of unmanned aerial vehicles (UAVs). Even though UAVs are becoming widespread, the task of creating voice interfaces for them is largely unaddressed. To this end, we introduce a multi-modal evaluation dataset for UAV control, consisting of spoken commands and associated images, which represent the visual context of what the UAV "sees" when the pilot utters the command. We provide baseline results and address two research directions: (i) how robust the language models are, given an incomplete list of commands at train time; (ii) how to incorporate visual information in the language model. We find that recurrent neural networks (RNNs) are a solution to both tasks: they can be successfully adapted using a small number of commands and they can be extended to use visual cues. Our results show that the image-based RNN outperforms its text-only counterpart even if the command-image training associations are automatically generated and inherently imperfect. The dataset and our code are available at http://kite.speed.pub.ro.
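To make the idea of a language model that uses visual cues more concrete, the following is a minimal sketch of one way an RNN language model could be conditioned on an image. It is not the authors' architecture: the abstract only states that the RNN is extended to use visual cues. Here we assume a simple Elman RNN in which a fixed image feature vector is fed into the hidden state at every time step. The class name `ImageConditionedRNNLM`, the toy vocabulary, the dimensions, and the random (untrained) weights are all hypothetical and chosen purely for illustration.

```python
import math
import random


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def softmax(z):
    """Numerically stable softmax over a list of logits."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def rand_matrix(rows, cols, rng):
    """Small random weights; a real model would learn these from data."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class ImageConditionedRNNLM:
    """Toy Elman RNN language model conditioned on an image feature vector.

    Assumption: the image vector is injected at every time step. This is one
    common design, not necessarily the one used in the paper.
    """

    def __init__(self, vocab, hidden_dim, image_dim, seed=0):
        rng = random.Random(seed)
        # Sentence-boundary tokens plus the command vocabulary, in a fixed order.
        self.vocab = ["<s>", "</s>"] + sorted(vocab)
        self.index = {w: i for i, w in enumerate(self.vocab)}
        n = len(self.vocab)
        self.hidden_dim = hidden_dim
        self.W_word = rand_matrix(hidden_dim, n, rng)          # word -> hidden
        self.W_img = rand_matrix(hidden_dim, image_dim, rng)   # image -> hidden
        self.W_rec = rand_matrix(hidden_dim, hidden_dim, rng)  # hidden -> hidden
        self.W_out = rand_matrix(n, hidden_dim, rng)           # hidden -> logits

    def step(self, word, h_prev, image_vec):
        """Consume one word; return (new hidden state, next-word distribution)."""
        one_hot = [0.0] * len(self.vocab)
        one_hot[self.index[word]] = 1.0
        pre = [
            a + b + c
            for a, b, c in zip(
                matvec(self.W_word, one_hot),
                matvec(self.W_img, image_vec),
                matvec(self.W_rec, h_prev),
            )
        ]
        h = [math.tanh(v) for v in pre]
        return h, softmax(matvec(self.W_out, h))

    def log_prob(self, words, image_vec):
        """Log-probability of a whole command given the image features."""
        h = [0.0] * self.hidden_dim
        total = 0.0
        prev = "<s>"
        for target in list(words) + ["</s>"]:
            h, probs = self.step(prev, h, image_vec)
            total += math.log(probs[self.index[target]])
            prev = target
        return total


# Usage with made-up image features (hypothetical values, not from the dataset).
lm = ImageConditionedRNNLM({"go", "to", "the", "tree", "car"}, hidden_dim=8, image_dim=4)
image_vec = [0.2, 0.0, 0.9, 0.1]
_, next_word_probs = lm.step("<s>", [0.0] * 8, image_vec)
score = lm.log_prob(["go", "to", "the", "tree"], image_vec)
```

In a speech recognizer, a score such as `log_prob` could be used to rerank candidate transcriptions, so that commands consistent with what the UAV "sees" are preferred. With untrained weights the scores here are meaningless; the sketch only shows where the visual context enters the computation.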

Mon Sep 07 2020
Machine Learning
Deep Learning and Reinforcement Learning for Autonomous Unmanned Aerial Systems: Roadmap for Theory to Deployment
Unmanned Aerial Systems (UAS) are being increasingly deployed for commercial, civilian, and military applications. The current UAS state-of-the-art still depends on a remote human controller with robust wireless links. Enabling autonomy and intelligence to the UAS will help overcome this hurdle and expand its use.
Sat Mar 10 2018
Machine Learning
Speech Recognition: Keyword Spotting Through Image Recognition
The problem of identifying voice commands has always been a challenge. We will compare the efficacies of several neural network architectures for the speech recognition problem. The result is a demonstration of how to convert a problem in audio recognition to the better-studied domain of image classification.
Mon Oct 23 2017
Artificial Intelligence
Listening to the World Improves Speech Command Recognition
We study transfer learning in convolutional network architectures applied to recognizing audio. Our key finding is that it is possible to transfer representations from an unrelated task like environmental sound classification to a voice-focused task like speech command recognition. We also investigate the effect of increased model capacity.
Wed Sep 09 2020
Artificial Intelligence
Unmanned Aerial Vehicle Control Through Domain-based Automatic Speech Recognition
Drones are becoming a part of our lives and reaching out to many areas of society, including the industrialized world. However, control through such devices is not a natural, human-like communication interface. In this work, we present a domain-based speech-to-action recognition architecture to effectively control an unmanned aerial vehicle such as a drone.
Fri Nov 17 2017
Computer Vision
Fast Recurrent Fully Convolutional Networks for Direct Perception in Autonomous Driving
Deep convolutional neural networks (CNNs) have been shown to perform extremely well at a variety of tasks including subtasks of autonomous driving. However, networks designed for these tasks typically require vast quantities of training data and long training periods to converge. We propose and compare three small and inexpensive
Wed Feb 10 2021
Neural Networks
Fast Classification Learning with Neural Networks and Conceptors for Speech Recognition and Car Driving Maneuvers
Recurrent neural networks are a powerful means in diverse applications. We show that, together with so-called conceptors, they also allow fast learning. In addition, a relatively small number of examples suffices to train neural networks with high accuracy.