Published on Mon Mar 04 2019

Sim-to-Real Transfer for Biped Locomotion

Wenhao Yu, Visak CV Kumar, Greg Turk, C. Karen Liu

Abstract

We present a new approach for transferring dynamic robot control policies such as biped locomotion from simulation to real hardware. Key to our approach is performing system identification of the model parameters μ of the hardware (e.g. friction, center of mass) in two distinct stages: before policy learning (pre-sysID) and after policy learning (post-sysID). Pre-sysID begins by collecting trajectories from the physical hardware based on a set of generic motion sequences. Because these trajectories may not be related to the task of interest, pre-sysID does not attempt to accurately identify the true value of μ, but only to approximate the range of μ in order to guide the policy learning. Next, a Projected Universal Policy (PUP) is created by simultaneously training a network that projects μ to a low-dimensional latent variable η and a family of policies that are conditioned on η. The second round of system identification (post-sysID) is then carried out by deploying the PUP on the robot hardware using task-relevant trajectories. We use Bayesian Optimization to determine the values of η that optimize the performance of the PUP on the real hardware. We have used this approach to create three successful biped locomotion controllers (walk forward, walk backward, walk sideways) on the Darwin OP2 robot.
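The pipeline above can be sketched in a few lines. This is a minimal, hypothetical illustration only: the dimensions, the linear projection and policy layers, and the reward function are all invented for the sketch, and plain random search stands in for the Bayesian Optimization the paper actually uses for post-sysID.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes (not from the paper): model parameters mu,
# latent eta, observation and action dimensions.
MU_DIM, ETA_DIM, OBS_DIM, ACT_DIM = 8, 2, 10, 4

# Projection network mu -> eta (sketched here as one fixed linear layer).
W_proj = rng.normal(size=(ETA_DIM, MU_DIM))

def project(mu):
    """Map full model parameters mu to the low-dimensional latent eta."""
    return np.tanh(W_proj @ mu)

# Projected Universal Policy pi(obs, eta) -> action, sketched as a
# single linear layer over the concatenated input.
W_pi = rng.normal(size=(ACT_DIM, OBS_DIM + ETA_DIM))

def pup(obs, eta):
    return np.tanh(W_pi @ np.concatenate([obs, eta]))

# Post-sysID: search directly over eta for the best return on hardware.
# The paper uses Bayesian Optimization; dependency-free random search
# stands in for it here.
def hardware_return(eta):
    # Placeholder for rollouts on the physical robot (invented reward).
    return -np.sum((eta - 0.3) ** 2)

candidates = rng.uniform(-1, 1, size=(200, ETA_DIM))
best_eta = max(candidates, key=hardware_return)
```

Because the search happens in the low-dimensional η space rather than over the full μ, sample-inefficient hardware rollouts are kept to a minimum, which is the point of the projection.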

Sat Oct 15 2016
Machine Learning
Sample Efficient Optimization for Learning Controllers for Bipedal Locomotion
Bayesian Optimization is a powerful sample-efficient tool for optimizing non-convex black-box functions. However, its performance can degrade in higher dimensions. We develop a distance metric for bipedal locomotion that enhances the sample efficiency of Bayesian optimization.
Tue Oct 09 2018
Machine Learning
Realizing Learned Quadruped Locomotion Behaviors through Kinematic Motion Primitives
The Stoch robot can walk, trot, gallop and bound using kinematic motion primitives (kMPs). These primitives are learned from deep reinforcement learning (D-RL). This type of approach improves the transferability of these gaits to real hardware.
Fri Mar 26 2021
Artificial Intelligence
Reinforcement Learning for Robust Parameterized Locomotion Control of Bipedal Robots
Thu Aug 09 2018
Machine Learning
Augmenting Physical Simulators with Stochastic Neural Networks: Case Study of Planar Pushing and Bouncing
An efficient, generalizable physical simulator with universal uncertainty estimates has wide applications in robot state estimation, planning, and control. We build such a simulator for two scenarios, planar pushing and ball bouncing, by augmenting an analytical rigid-body simulator with a neural network.
Fri Mar 26 2021
Artificial Intelligence
Imitation Learning from MPC for Quadrupedal Multi-Gait Control
We present a learning algorithm for training a single policy that imitates multiple gaits of a walking robot. To achieve this, we use and extend MPC-Net, which is an Imitation Learning approach guided by Model Predictive Control (MPC).
Wed Feb 08 2017
Machine Learning
Preparing for the Unknown: Learning a Universal Policy with Online System Identification
The system is made of two components: a Universal Policy (UP) and a function for Online System Identification (OSI). We describe our control policy as universal because it is trained over a wide array of dynamic models.
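The UP + OSI architecture can be sketched as two composed functions: OSI regresses the model parameters μ from a recent state-action history, and the UP conditions its action on that estimate. Everything below is a hypothetical illustration; the sizes and the linear maps are invented, and the original work trains both components as neural networks.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical sizes: a flattened (state, action) history fed to OSI.
HIST_DIM, MU_DIM, OBS_DIM, ACT_DIM = 12, 3, 6, 2

# Online System Identification: regress the model parameters mu from
# the recent state-action history (sketched as one linear map).
W_osi = rng.normal(size=(MU_DIM, HIST_DIM)) * 0.1

def osi(history):
    return W_osi @ history

# Universal Policy: conditioned on the OSI estimate of mu, so one
# policy can act across a wide array of dynamic models.
W_up = rng.normal(size=(ACT_DIM, OBS_DIM + MU_DIM))

def universal_policy(obs, mu_hat):
    return np.tanh(W_up @ np.concatenate([obs, mu_hat]))

# One control step: identify online, then act.
history = rng.normal(size=HIST_DIM)
obs = rng.normal(size=OBS_DIM)
action = universal_policy(obs, osi(history))
```

Note the contrast with the PUP approach above: UP + OSI estimates the full μ online at every step, whereas PUP compresses μ to a latent η that is calibrated once on hardware.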