Published on Sun Apr 28 2019

Improving Image-Based Localization with Deep Learning: The Impact of the Loss Function

Isaac Ronald Ward, M. A. Asim K. Jalwana, Mohammed Bennamoun

Abstract

This work investigates the impact of the loss function on the performance of neural networks, in the context of a monocular, RGB-only image localization task. A common technique when regressing a camera's pose from an image is to formulate the loss as a linear combination of positional and rotational mean squared error, with tuned hyperparameters as coefficients. In this work we observe that changes to rotation and position jointly affect the captured image, so a pose regression network's loss function should include a term that combines the error of both of these coupled quantities in order to improve performance. Based on task-specific observations and experimental tuning, we present such a loss term and create a new model by appending it to the loss function of the pre-existing pose regression network 'PoseNet'. This improves the network's localization accuracy for indoor scenes, decreasing the median positional and rotational errors by up to 26.7% and 24.0%, respectively, compared to the default PoseNet.
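To make the structure of such a loss concrete, the following is a minimal sketch in Python with NumPy. The first two terms are the linear combination of positional and rotational mean squared error described above. The third, coupled term is purely illustrative: the abstract does not give the form of the authors' term, so the geometric mean used here, along with the names `pose_loss`, `alpha`, `beta`, and `gamma`, is an assumption and not the published method.

```python
import numpy as np


def pose_loss(pos_pred, pos_true, rot_pred, rot_true,
              alpha=1.0, beta=1.0, gamma=0.0):
    """Sketch of a pose regression loss with an optional coupled term.

    alpha, beta, gamma are hypothetical names for the tuned coefficients.
    """
    pos_pred, pos_true = np.asarray(pos_pred, float), np.asarray(pos_true, float)
    rot_pred, rot_true = np.asarray(rot_pred, float), np.asarray(rot_true, float)

    # Baseline: mean squared error on position and on rotation.
    pos_mse = np.mean((pos_pred - pos_true) ** 2)
    rot_mse = np.mean((rot_pred - rot_true) ** 2)

    # ASSUMPTION: an illustrative coupled term (geometric mean of the two
    # errors). It stands in for "a term which combines the error of both
    # quantities"; it is NOT the term proposed in the paper.
    coupled = np.sqrt(pos_mse * rot_mse)

    return alpha * pos_mse + beta * rot_mse + gamma * coupled


if __name__ == "__main__":
    pos_pred, pos_true = [[1.0, 0.0, 0.0]], [[0.0, 0.0, 0.0]]
    rot_pred, rot_true = [[1.0, 0.0, 0.0, 0.0]], [[0.0, 1.0, 0.0, 0.0]]
    print("baseline loss:", pose_loss(pos_pred, pos_true, rot_pred, rot_true))
    print("with coupled term:",
          pose_loss(pos_pred, pos_true, rot_pred, rot_true, gamma=2.0))
```

With `gamma=0.0` the function reduces to the baseline linear combination, so the effect of an added term can be compared against the default formulation by changing a single coefficient.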

Sun Apr 02 2017
Computer Vision
Geometric Loss Functions for Camera Pose Regression with Deep Learning
PoseNet is a deep convolutional neural network which learns to regress the 6-DOF camera pose from a single image. It was trained using a naive loss function, with hyper-parameters which require expensive tuning.
Mon Mar 18 2019
Computer Vision
Understanding the Limitations of CNN-based Absolute Camera Pose Regression
Visual localization is a key problem in computer vision and robotics. End-to-end approaches based on convolutional neural networks have become popular. However, they do not achieve the same level of pose accuracy as 3D structure methods.
Tue Mar 16 2021
Computer Vision
Back to the Feature: Learning Robust Camera Localization from Pixels to Pose
PixLoc is a scene-agnostic neural network. It estimates an accurate 6-DoF pose from an image and a 3D model. The code will be publicly available at https://github.com/cvg/pixloc.
Sun Sep 10 2017
Computer Vision
DPC-Net: Deep Pose Correction for Visual Localization
We present a novel method to fuse the power of deep networks with the computational efficiency of geometric and probabilistic localization algorithms. We show how DPC-Net can be used to mitigate the effect of poorly calibrated lens distortion parameters.
Mon Jul 08 2019
Computer Vision
Introduction to Camera Pose Estimation with Deep Learning
Deep learning has transformed the field of computer vision. Recently, it was applied to regressing the absolute camera pose from an RGB image. Although the resulting accuracy was sub-optimal, this led to a surge of learning-based pose estimation methods.
Wed May 27 2015
Neural Networks
PoseNet: A Convolutional Network for Real-Time 6-DOF Camera Relocalization
PoseNet is a real-time monocular six degree of freedom relocalization system. It trains a convolutional neural network to regress the 6-DOF camera pose from a single RGB image in an end-to-end manner with no need for engineering or optimisation.