Published on Sat Aug 12 2017

Deep Steering: Learning End-to-End Driving Model from Spatial and Temporal Visual Cues

Lu Chi, Yadong Mu

In recent years, autonomous driving algorithms using low-cost vehicle-mounted cameras have attracted increasing effort from both academia and industry. The proposed model combines spatial and temporal cues, jointly investigating instantaneous monocular camera observations and the vehicle's historical states. It is learned and evaluated on about 6 hours of human driving data provided by Udacity.

Abstract

In recent years, autonomous driving algorithms using low-cost vehicle-mounted cameras have attracted increasing effort from both academia and industry. These efforts span multiple fronts, including object detection on roads and 3-D reconstruction, but in this work we focus on a vision-based model that directly maps raw input images to steering angles using deep networks. This represents a nascent research topic in computer vision. The technical contributions of this work are three-fold. First, the model is learned and evaluated on real human driving videos that are time-synchronized with other vehicle sensors. This differs from many prior models trained on synthetic data from racing games. Second, state-of-the-art models such as PilotNet mostly predict the wheel angle independently for each video frame, which contradicts the common understanding of driving as a stateful process. Instead, our proposed model combines spatial and temporal cues, jointly investigating instantaneous monocular camera observations and the vehicle's historical states. In practice, this is accomplished by inserting carefully designed recurrent units (e.g., LSTM and Conv-LSTM) at appropriate network layers. Third, to facilitate the interpretability of the learned model, we utilize a visual back-propagation scheme to discover and visualize the image regions that most strongly influence the final steering prediction. Our experimental study is based on about 6 hours of human driving data provided by Udacity. Comprehensive quantitative evaluations demonstrate the effectiveness and robustness of our model, even under scenarios such as drastic lighting changes and abrupt turning. The comparison with other state-of-the-art models clearly reveals its superior performance in predicting the correct wheel angle for a self-driving car.
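The spatio-temporal idea described above lends itself to a compact sketch. Below is a minimal PyTorch sketch, not the paper's exact architecture: the backbone, layer sizes, and the way the previous steering angle is fused into the head are illustrative assumptions. A small CNN extracts spatial features per frame, a Conv-LSTM cell carries temporal state across frames, and a regression head combines the pooled visual features with the vehicle's previous steering angle to predict the current one.

# Minimal sketch (assumed architecture, not the authors' exact model): per-frame CNN
# features -> Conv-LSTM temporal state -> head fusing pooled features with the
# previous steering angle.
import torch
import torch.nn as nn

class ConvLSTMCell(nn.Module):
    """Standard convolutional LSTM cell operating on feature maps."""
    def __init__(self, in_ch, hid_ch, kernel_size=3):
        super().__init__()
        # One convolution produces the input, forget, output and candidate gates.
        self.gates = nn.Conv2d(in_ch + hid_ch, 4 * hid_ch, kernel_size,
                               padding=kernel_size // 2)

    def forward(self, x, state):
        h, c = state
        i, f, o, g = torch.chunk(self.gates(torch.cat([x, h], dim=1)), 4, dim=1)
        c = torch.sigmoid(f) * c + torch.sigmoid(i) * torch.tanh(g)
        h = torch.sigmoid(o) * torch.tanh(c)
        return h, c

class SteeringNet(nn.Module):
    def __init__(self):
        super().__init__()
        # Small convolutional backbone applied to each frame (illustrative sizes).
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 24, 5, stride=2), nn.ReLU(),
            nn.Conv2d(24, 36, 5, stride=2), nn.ReLU(),
            nn.Conv2d(36, 48, 3, stride=2), nn.ReLU(),
        )
        self.conv_lstm = ConvLSTMCell(48, 64)
        # Head fuses pooled visual features with the previous steering angle.
        self.head = nn.Sequential(nn.Linear(64 + 1, 50), nn.ReLU(), nn.Linear(50, 1))

    def forward(self, frames, prev_angle):
        # frames: (batch, time, 3, H, W); prev_angle: (batch, time, 1)
        b, t = frames.shape[:2]
        # Dummy pass on the first frame only to infer the feature-map size.
        feat = self.backbone(frames[:, 0])
        h = torch.zeros(b, 64, *feat.shape[2:])
        c = torch.zeros_like(h)
        preds = []
        for step in range(t):
            x = self.backbone(frames[:, step])
            h, c = self.conv_lstm(x, (h, c))
            pooled = h.mean(dim=(2, 3))  # global average pool over the feature map
            preds.append(self.head(torch.cat([pooled, prev_angle[:, step]], dim=1)))
        return torch.stack(preds, dim=1)  # (batch, time, 1)

if __name__ == "__main__":
    net = SteeringNet()
    angles = net(torch.randn(2, 5, 3, 66, 200), torch.zeros(2, 5, 1))
    print(angles.shape)  # torch.Size([2, 5, 1])

Training such a model amounts to regressing the predicted angles against the recorded human steering angles (e.g., with a mean-squared-error loss) over short video clips, which is consistent with the stateful view of driving argued for in the abstract.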

Sat Jan 20 2018
Computer Vision
End-to-end Multi-Modal Multi-Task Vehicle Control for Self-Driving Cars with Visual Perception
Convolutional Neural Networks (CNNs) have been successfully applied to autonomous driving tasks, many of them in an end-to-end manner. We propose a multi-task learning framework to predict the steering angle and speed control simultaneously.
Tue Oct 10 2017
Machine Learning
End-to-End Deep Learning for Steering Autonomous Vehicles Considering Temporal Dependencies
Steering a car through traffic is a complex task that is difficult to cast into algorithms. Researchers instead train artificial neural networks on a front-facing camera data stream along with the associated steering angles. Most existing solutions consider only the visual camera frames as input.
Fri May 01 2015
Computer Vision
DeepDriving: Learning Affordance for Direct Perception in Autonomous Driving
There are two major paradigms for vision-based autonomous driving: mediated perception approaches and behavior reflex approaches. In this paper, we propose a third paradigm: a direct perception approach to estimate the affordance for driving. We propose to map an input image to a small number of key perception indicators that directly relate to the affordance of a road/traffic state for driving.
Wed Dec 11 2019
Machine Learning
Self-Driving Car Steering Angle Prediction Based on Image Recognition
Udacity has released a dataset containing, among other data, a set of images with the steering angle captured during driving. The Udacity challenge aimed to predict the steering angle based only on the provided images.
Sun Feb 10 2019
Machine Learning
NeurAll: Towards a Unified Model for Visual Perception in Automated Driving
Convolutional Neural Networks (CNNs) are successfully used for visual perception tasks including object recognition, motion and depth estimation, visual SLAM, etc. These tasks are typically independently explored and modeled. We propose a joint multi-task network design for learning several tasks simultaneously.
Wed Dec 19 2018
Computer Vision
Learning On-Road Visual Control for Self-Driving Vehicles with Auxiliary Tasks
NVIDIA recently proposed an End-to-End algorithm that can directly learn steering commands from raw pixels of a front camera. In this paper, we leverage auxiliary information aside from raw images and design a novel network structure called Auxiliary Task Network.