Published on Fri Oct 04 2019

Two Stream Networks for Self-Supervised Ego-Motion Estimation

Rares Ambrus, Vitor Guizilini, Jie Li, Sudeep Pillai, Adrien Gaidon

We propose a novel self-supervised two-stream network using RGB and inferred depth information for accurate visual odometry. We introduce a sparsity-inducing data augmentation policy for ego-motion learning that effectively regularizes the pose network.

Abstract

Learning depth and camera ego-motion from raw unlabeled RGB video streams is seeing exciting progress through self-supervision from strong geometric cues. To leverage not only appearance but also scene geometry, we propose a novel self-supervised two-stream network using RGB and inferred depth information for accurate visual odometry. In addition, we introduce a sparsity-inducing data augmentation policy for ego-motion learning that effectively regularizes the pose network to enable stronger generalization performance. As a result, we show that our proposed two-stream pose network achieves state-of-the-art results among learning-based methods on the KITTI odometry benchmark, and is especially suited for self-supervision at scale. Our experiments on a large-scale urban driving dataset of 1 million frames indicate that the performance of our proposed architecture does indeed scale progressively with more data.

Wed Aug 28 2019
Computer Vision
Unsupervised Scale-consistent Depth and Ego-motion Learning from Monocular Video
Recent work has shown that CNN-based depth and ego-motion estimators can be learned using unlabelled monocular videos. However, the performance is limited by unidentified moving objects that violate the underlying static scene assumption in geometric image reconstruction.
Tue Mar 06 2018
Computer Vision
GeoNet: Unsupervised Learning of Dense Depth, Optical Flow and Camera Pose
GeoNet is a joint unsupervised learning framework for monocular depth, optical flow and ego-motion estimation from videos. The three components are coupled by the nature of 3D scene geometry and are learned jointly by our framework in an end-to-end manner.
Sat Sep 28 2019
Computer Vision
Self-Supervised Learning of Depth and Ego-motion with Differentiable Bundle Adjustment
Most existing learning-based methods deal with this task in a supervised manner, which requires ground-truth data that is expensive to acquire. In this paper we propose to jointly optimize the scene depth and camera motion via a differentiable Bundle Adjustment layer.
Wed Aug 04 2021
Computer Vision
Self-Supervised Learning of Depth and Ego-Motion from Video by Alternative Training and Geometric Constraints from 3D to 2D
Self-supervised learning of depth and ego-motion from unlabeled monocular video has achieved promising results. Most existing methods jointly train the depth and pose networks by photometric consistency of adjacent frames. We aim to improve the depth-pose learning performance without the auxiliary tasks.
Mon Feb 25 2019
Computer Vision
Beyond Photometric Loss for Self-Supervised Ego-Motion Estimation
Accurate relative pose is one of the key components in visual odometry (VO) and simultaneous localization and mapping (SLAM). Recently, the self-supervised learning framework that jointly optimizes the relative pose and target image depth has attracted the attention of the community.
Tue May 25 2021
Computer Vision
Unsupervised Scale-consistent Depth Learning from Video
We propose a monocular depth estimator, SC-Depth. We show high-quality depth estimation results on both the KITTI and NYUv2 datasets. Our monocular-trained deep networks are readily integrated into the ORB-SLAM2 system for more robust and accurate tracking.