Published on Thu Oct 31 2019

VASE: Variational Assorted Surprise Exploration for Reinforcement Learning

Haitao Xu, Brendan McCane, Lech Szymanski


Abstract

Exploration in environments with continuous control and sparse rewards remains a key challenge in reinforcement learning (RL). Recently, surprise has been used as an intrinsic reward that encourages systematic and efficient exploration. We introduce a new definition of surprise and its RL implementation named Variational Assorted Surprise Exploration (VASE). VASE uses a Bayesian neural network as a model of the environment dynamics and is trained using variational inference, alternately updating the accuracy of the agent's model and policy. Our experiments show that in continuous control sparse reward environments VASE outperforms other surprise-based exploration techniques.
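Below is a minimal sketch (not the authors' released code) of how such a surprise-driven loop might look: a Bayesian dynamics network with a factorised Gaussian posterior over its weights is fit by variational inference, and the per-step surprise serves as an intrinsic bonus added to the sparse extrinsic reward. The class and function names, the prediction-error form of the bonus, and the coefficients beta and eta are illustrative assumptions, not details taken from the paper.

import torch
import torch.nn as nn
import torch.nn.functional as F

class BayesianDynamics(nn.Module):
    # Two-layer predictor of the next state from (state, action), with a
    # factorised Gaussian posterior over its weights (variational parameters
    # mu and rho, where std = softplus(rho)).
    def __init__(self, s_dim, a_dim, hidden=64):
        super().__init__()
        in_dim = s_dim + a_dim
        self.mu1 = nn.Parameter(0.1 * torch.randn(in_dim, hidden))
        self.rho1 = nn.Parameter(-3.0 * torch.ones(in_dim, hidden))
        self.mu2 = nn.Parameter(0.1 * torch.randn(hidden, s_dim))
        self.rho2 = nn.Parameter(-3.0 * torch.ones(hidden, s_dim))

    def _sample(self, mu, rho):
        std = F.softplus(rho)
        return mu + std * torch.randn_like(std)      # reparameterisation trick

    def forward(self, s, a):
        x = torch.cat([s, a], dim=-1)
        w1 = self._sample(self.mu1, self.rho1)
        w2 = self._sample(self.mu2, self.rho2)
        return torch.tanh(x @ w1) @ w2               # sampled prediction of s'

    def kl_to_prior(self):
        # KL(q(w) || N(0, I)), the regulariser in the variational (ELBO) loss.
        kl = 0.0
        for mu, rho in [(self.mu1, self.rho1), (self.mu2, self.rho2)]:
            std = F.softplus(rho)
            kl = kl + (0.5 * (std ** 2 + mu ** 2 - 1.0) - torch.log(std)).sum()
        return kl

def intrinsic_reward(model, s, a, s_next):
    # Surprise bonus: squared prediction error of the sampled dynamics model
    # (a simple stand-in for the paper's surprise measure).
    with torch.no_grad():
        return ((model(s, a) - s_next) ** 2).sum(dim=-1)

# Alternating updates (schematic): after each batch of transitions,
#   1) fit the model:   loss = prediction error + beta * model.kl_to_prior()
#   2) update the policy on  r_extrinsic + eta * intrinsic_reward(model, s, a, s_next)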

Mon Mar 06 2017
Machine Learning
Surprise-Based Intrinsic Motivation for Deep Reinforcement Learning
Exploration in complex domains is a key challenge in reinforcement learning, especially for tasks with very sparse rewards. We propose to learn a model of the MDP transition probabilities concurrently with the policy, and to form intrinsic rewards that approximate the KL-divergence of the true transition probabilities from the learned model.
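As a rough illustration of that idea (a sketch under our own assumptions, not the paper's implementation), the KL-based bonus can be approximated by the surprisal of the observed transition under the learned Gaussian dynamics model, since the true transition probabilities are unavailable; the argument names and the weight eta below are hypothetical placeholders.

import torch

def surprisal_bonus(pred_mean, pred_logstd, s_next):
    # Negative log-likelihood of the observed next state under a learned
    # Gaussian dynamics model p_phi(s' | s, a); a tractable stand-in for the
    # KL-based surprise when the true transition distribution is unknown.
    dist = torch.distributions.Normal(pred_mean, pred_logstd.exp())
    return -dist.log_prob(s_next).sum(dim=-1)

# Shaped reward used for policy optimisation (eta is a hypothetical weight):
# r_total = r_extrinsic + eta * surprisal_bonus(mean, logstd, s_next)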
Tue May 31 2016
Artificial Intelligence
VIME: Variational Information Maximizing Exploration
Scalable and effective exploration remains a key challenge in reinforcement learning (RL). Most contemporary RL relies on simple heuristics such as epsilon-greedy exploration or adding Gaussian noise to the controls. This paper introduces Variational Information Maximizing Exploration (VIME).
Thu Apr 15 2021
Machine Learning
Self-Supervised Exploration via Latent Bayesian Surprise
Training with Reinforcement Learning requires a reward function that is used to guide the agent towards achieving its objective. Generating rewards in a self-supervised way, by inspiring the agent with an intrinsic desire to learn and explore the environment, might induce more general behaviours.
Mon Jul 12 2021
Machine Learning
Explore and Control with Adversarial Surprise
Reinforcement learning (RL) provides a framework for learning goal-directed policies given user-specified rewards. We are interested in the problem of learning without rewards, where agents must discover useful behaviors in the absence of task-specific incentives. We propose a new unsupervised RL technique based on an adversarial game.
Wed Dec 11 2019
Machine Learning
SMiRL: Surprise Minimizing Reinforcement Learning in Unstable Environments
Every living organism struggles against disruptive environmental forces. We propose that such a struggle to achieve and preserve order might offer a principle for the emergence of useful behaviors in artificial agents.
Tue Nov 19 2019
Machine Learning
Implicit Generative Modeling for Efficient Exploration
Efficient exploration remains a challenging problem in reinforcement learning. A commonly used approach for exploring environments is to introduce some "intrinsic" reward. In this work, we focus on model uncertainty estimation as an intrinsic reward for efficient exploration.