Published on Sun Apr 11 2021

Learn Goal-Conditioned Policy with Intrinsic Motivation for Deep Reinforcement Learning

Jinxin Liu, Donglin Wang, Qiangxing Tian, Zhengyu Chen
Abstract

It is important for an agent to learn a widely applicable, general-purpose policy that can achieve diverse goals, including images and text descriptions. For such perceptually-specific goals, a frontier of deep reinforcement learning research is learning a goal-conditioned policy without hand-crafted rewards. To learn this kind of policy, recent works typically use as the reward a non-parametric distance to the given goal in an explicit embedding space. Taking a different viewpoint, we propose a novel unsupervised learning approach named goal-conditioned policy with intrinsic motivation (GPIM), which jointly learns an abstract-level policy and a goal-conditioned policy. The abstract-level policy is conditioned on a latent variable to optimize a discriminator, and it discovers diverse states that are then rendered into perceptually-specific goals for the goal-conditioned policy. The learned discriminator serves as an intrinsic reward function for the goal-conditioned policy, encouraging it to imitate the trajectory induced by the abstract-level policy. Experiments on various robotic tasks demonstrate the effectiveness and efficiency of the proposed GPIM method, which substantially outperforms prior techniques.
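
Reading the abstract, one joint training iteration can be pictured roughly as below. This is only an illustrative sketch of that description, not the authors' implementation: it assumes a DIAYN-style categorical latent variable and a shared state space for both policies, and the names env.rollout, render_goal, and update_policy are hypothetical placeholders for an environment interface, a state-to-goal renderer, and any standard RL policy update (e.g. SAC or PPO).

```python
# Rough sketch (not the paper's code) of one GPIM-style training iteration.
import torch
import torch.nn as nn
import torch.nn.functional as F

N_SKILLS = 8      # size of the discrete latent space (assumption)
STATE_DIM = 4     # dimensionality of abstract-level states (assumption)

# Discriminator q(z | s): predicts which latent variable produced a state.
discriminator = nn.Sequential(
    nn.Linear(STATE_DIM, 64), nn.ReLU(), nn.Linear(64, N_SKILLS))
disc_opt = torch.optim.Adam(discriminator.parameters(), lr=3e-4)

def intrinsic_reward(state, z):
    """log q(z|s) - log p(z): large when the state is diagnostic of z."""
    with torch.no_grad():
        log_q = F.log_softmax(discriminator(state), dim=-1)[z]
    return (log_q - torch.log(torch.tensor(1.0 / N_SKILLS))).item()

def gpim_iteration(abstract_policy, goal_policy, env, render_goal, update_policy):
    # 1) Sample a latent variable and roll out the abstract-level policy;
    #    rollouts are assumed to return a list of (state, action) tensor pairs.
    z = torch.randint(N_SKILLS, ()).item()
    abstract_traj = env.rollout(abstract_policy, condition=z)

    # 2) Fit the discriminator to recover z from the visited abstract states.
    states = torch.stack([s for s, _ in abstract_traj])
    disc_loss = F.cross_entropy(
        discriminator(states),
        torch.full((len(states),), z, dtype=torch.long))
    disc_opt.zero_grad(); disc_loss.backward(); disc_opt.step()

    # 3) Update the abstract-level policy with the intrinsic reward, which
    #    pushes it toward diverse, discriminable states.
    abstract_rewards = [intrinsic_reward(s, z) for s, _ in abstract_traj]
    update_policy(abstract_policy, abstract_traj, abstract_rewards)

    # 4) Render a discovered state into a perceptually-specific goal
    #    (e.g. an image) and roll out the goal-conditioned policy toward it.
    goal = render_goal(abstract_traj[-1][0])
    goal_traj = env.rollout(goal_policy, condition=goal)

    # 5) The same discriminator scores the goal-conditioned trajectory, so
    #    the goal policy is rewarded for imitating the abstract trajectory.
    goal_rewards = [intrinsic_reward(s, z) for s, _ in goal_traj]
    update_policy(goal_policy, goal_traj, goal_rewards)
```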

Thu Feb 14 2019
Machine Learning
Unsupervised Visuomotor Control through Distributional Planning Networks
Reinforcement learning has the potential to enable robots to acquire a wide range of skills. In practice, RL usually requires manual, per-task engineering of reward functions. We aim to learn an unsupervised embedding space under which the robot can measure progress towards a goal for itself.
Thu May 13 2021
Artificial Intelligence
MapGo: Model-Assisted Policy Optimization for Goal-Oriented Tasks
In goal-oriented reinforcement learning, relabeling the raw goals in past experience to give agents hindsight is a major solution to the reward-sparsity problem. To enhance the diversity of relabeled goals, we develop FGI (Foresight Goal Inference).
Tue Oct 02 2018
Artificial Intelligence
EMI: Exploration with Mutual Information
EMI is an exploration method that does not rely on generative decoding of the full observation. It extracts predictive signals that can be used to guide exploration based on forward prediction. EMI has shown competitive results on challenging locomotion tasks.
Mon Sep 14 2020
Machine Learning
Decoupling Representation Learning from Reinforcement Learning
In an effort to overcome limitations of reward-driven feature learning, we propose decoupling representation learning from policy learning. To this end, we introduce a new task, called Augmented Temporal Contrast (ATC), which trains a convolutional encoder to associate pairs of observations.
Fri Jun 12 2020
Machine Learning
Mutual Information Based Knowledge Transfer Under State-Action Dimension Mismatch
Deep reinforcement learning (RL) algorithms have achieved great success on a wide variety of sequential decision-making tasks. However, many of these algorithms suffer from high sample complexity when learning from scratch. Transfer learning is a promising approach to reducing sample complexity in RL.
Fri Nov 22 2019
Artificial Intelligence
A Transfer Learning Method for Goal Recognition Exploiting Cross-Domain Spatial Features
The ability to infer the intentions of others and predict their goals is a critical feature of intelligent agents. Research increasingly focuses on learning to infer intentions directly from data. This paper presents a novel approach combining few-shot transfer learning with cross-domain spatial features.