Published on Fri Nov 01 2019

DD-PPO: Learning Near-Perfect PointGoal Navigators from 2.5 Billion Frames

Erik Wijmans, Abhishek Kadian, Ari Morcos, Stefan Lee, Irfan Essa, Devi Parikh, Manolis Savva, Dhruv Batra

Decentralized Distributed Proximal Policy Optimization (DD-PPO) is a method for distributed reinforcement learning in resource-intensive simulated environments. DD-PPO is distributed (uses multiple machines), decentralized (lacks a centralized server), and synchronous (no computation is ever stale), making it conceptually simple to implement.

Abstract

We present Decentralized Distributed Proximal Policy Optimization (DD-PPO), a method for distributed reinforcement learning in resource-intensive simulated environments. DD-PPO is distributed (uses multiple machines), decentralized (lacks a centralized server), and synchronous (no computation is ever stale), making it conceptually simple and easy to implement. In our experiments on training virtual robots to navigate in Habitat-Sim, DD-PPO exhibits near-linear scaling -- achieving a speedup of 107x on 128 GPUs over a serial implementation. We leverage this scaling to train an agent for 2.5 billion steps of experience (the equivalent of 80 years of human experience) -- over 6 months of GPU-time training in under 3 days of wall-clock time with 64 GPUs. This massive-scale training not only sets the state of the art on the Habitat Autonomous Navigation Challenge 2019, but essentially solves the task -- near-perfect autonomous navigation in an unseen environment without access to a map, directly from an RGB-D camera and a GPS+Compass sensor. Fortuitously, error vs computation exhibits a power-law-like distribution; thus, 90% of peak performance is obtained relatively early (at 100 million steps) and relatively cheaply (under 1 day with 8 GPUs). Finally, we show that the scene understanding and navigation policies learned can be transferred to other navigation tasks -- the analog of ImageNet pre-training + task-specific fine-tuning for embodied AI. Our model outperforms ImageNet pre-trained CNNs on these transfer tasks and can serve as a universal resource (all models and code are publicly available).
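The "decentralized" and "synchronous" properties described in the abstract can be illustrated with a minimal, single-process sketch. This is not the authors' implementation: the toy linear model stands in for a PPO loss, the workers are simulated in one Python process, and the names `local_gradient`, `all_reduce_mean`, and `train` are hypothetical. A real system would use a collective-communication library across machines instead of the in-memory averaging shown here.

```python
import random


def local_gradient(params, batch):
    """Gradient of mean squared error for a toy model y = w * x.

    Assumption: this stands in for the gradient of a PPO loss computed
    from experience the worker collected in its own simulator.
    """
    w = params[0]
    g = sum(2.0 * (w * x - y) * x for x, y in batch) / len(batch)
    return [g]


def all_reduce_mean(per_worker_grads):
    """Simulated all-reduce: every worker receives the mean gradient.

    No parameter server is involved; each worker ends up with an
    identical copy of the averaged gradient.
    """
    n = len(per_worker_grads)
    dim = len(per_worker_grads[0])
    mean = [sum(g[i] for g in per_worker_grads) / n for i in range(dim)]
    return [list(mean) for _ in range(n)]


def train(num_workers=4, steps=300, lr=0.05, seed=0):
    rng = random.Random(seed)
    # Each worker keeps its own replica of the parameters (identical init).
    replicas = [[0.0] for _ in range(num_workers)]
    for _ in range(steps):
        # 1) Every worker computes a gradient from its own data.
        grads = []
        for rank in range(num_workers):
            xs = [rng.uniform(-1.0, 1.0) for _ in range(8)]
            batch = [(x, 3.0 * x) for x in xs]  # true weight is 3.0
            grads.append(local_gradient(replicas[rank], batch))
        # 2) Synchronous step: wait for all workers, then average.
        reduced = all_reduce_mean(grads)
        # 3) Every worker applies the same update, so replicas stay in sync
        #    and no gradient is ever computed from stale parameters.
        for rank in range(num_workers):
            replicas[rank] = [
                p - lr * g for p, g in zip(replicas[rank], reduced[rank])
            ]
    return replicas


if __name__ == "__main__":
    final = train()
    print("replica weights:", [r[0] for r in final])
```

Because every replica starts from the same parameters and applies the same averaged gradient, the replicas remain identical without any central copy of the model, which is the property the sketch is meant to show.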

Fri Dec 11 2020
Artificial Intelligence
How to Train PointGoal Navigation Agents on a (Sample and Compute) Budget
PointGoal navigation has seen significant recent interest and progress, spurred on by the Habitat platform and associated challenge. We conduct an extensive set of experiments, cumulatively totaling over 50,000 GPU-hours, that let us identify and discuss a number of ostensibly minor but significant design choices.
Fri Mar 12 2021
Artificial Intelligence
Large Batch Simulation for Deep Reinforcement Learning
Tue Aug 31 2021 (our pick)
Machine Learning
WarpDrive: Extremely Fast End-to-End Deep Multi-Agent Reinforcement Learning on a GPU
Deep reinforcement learning (RL) is a powerful framework to train decision-making models in complex dynamical environments. WarpDrive is a flexible, lightweight, and easy-to-use open-source RL framework. It implements end-to-end multi-agent RL on a single GPU (Graphics Processing Unit).
Thu Aug 13 2020
Computer Vision
Lift, Splat, Shoot: Encoding Images From Arbitrary Camera Rigs by Implicitly Unprojecting to 3D
The goal of perception for autonomous vehicles is to extract semantic representations from multiple sensors and fuse these representations into a single "bird's-eye-view" coordinate frame. The core idea behind our approach is to "lift" each image individually into a frustum of features for each camera.
Thu Jul 09 2020
Machine Learning
Auxiliary Tasks Speed Up Learning PointGoal Navigation
PointGoal Navigation is an embodied task that requires agents to navigate to a specified point in an unseen environment. Wijmans et al. showed that this task is solvable, but their method is computationally prohibitive, requiring 2.5 billion frames and 180 GPU-days. We develop a
Mon Sep 03 2018
Machine Learning
Flatland: a Lightweight First-Person 2-D Environment for Reinforcement Learning
Flatland is a simple, lightweight environment for fast prototyping and testing of reinforcement learning agents. It is of lower complexity compared to similar 3D platforms (e.g. DeepMind Lab or VizDoom), but emulates physical properties of the real world.