Published on Mon Mar 16 2020

Improving Performance in Reinforcement Learning by Breaking Generalization in Neural Networks

Sina Ghiassian, Banafsheh Rafiee, Yat Long Lo, Adam White

Abstract

Reinforcement learning systems require good representations to work well. For decades, practical success in reinforcement learning was limited to small domains. Deep reinforcement learning systems, on the other hand, are scalable, are not dependent on domain-specific prior knowledge, and have been successfully used to play Atari, in 3D navigation from pixels, and to control high-degree-of-freedom robots. Unfortunately, the performance of deep reinforcement learning systems is sensitive to hyper-parameter settings and architecture choices. Even well-tuned systems exhibit significant instability both within a trial and across experiment replications. In practice, significant expertise and trial and error are usually required to achieve good performance. One potential source of the problem is known as catastrophic interference: when later training decreases performance by overriding previous learning. Interestingly, the powerful generalization that makes Neural Networks (NN) so effective in batch supervised learning might explain the challenges when applying them in reinforcement learning tasks. In this paper, we explore how online NN training and interference interact in reinforcement learning. We find that simply re-mapping the input observations to a high-dimensional space improves learning speed and parameter sensitivity. We also show this preprocessing reduces interference in prediction tasks. More practically, we provide a simple approach to NN training that is easy to implement and requires little additional computation. We demonstrate that our approach improves performance in both prediction and control with an extensive batch of experiments in classic control domains.
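The abstract does not spell out the re-mapping itself, but a classic way to lift low-dimensional observations into a sparse high-dimensional space is tile coding. The sketch below is a minimal illustration under that assumption (the function name `tile_code` and the tiling parameters are illustrative, not taken from the paper): each of several offset tilings activates exactly one binary feature, so a 2-D observation becomes a sparse 512-dimensional vector that a network can consume instead of the raw input.

```python
import numpy as np

def tile_code(obs, low, high, n_tilings=8, n_tiles=8):
    """Map a low-dimensional observation to a sparse high-dimensional
    binary feature vector using offset tilings (a tile-coding sketch).

    obs, low, high: 1-D sequences of the same length.
    Returns a binary vector of length n_tilings * n_tiles ** len(obs).
    """
    obs = np.asarray(obs, dtype=float)
    low = np.asarray(low, dtype=float)
    high = np.asarray(high, dtype=float)
    d = len(obs)
    # Normalize each observation dimension to [0, 1).
    scaled = (obs - low) / (high - low)
    features = np.zeros(n_tilings * n_tiles ** d)
    for t in range(n_tilings):
        # Each tiling is shifted by a fraction of one tile width.
        offset = t / (n_tilings * n_tiles)
        idx = np.clip(((scaled + offset) * n_tiles).astype(int), 0, n_tiles - 1)
        # Flatten the per-dimension tile indices into a single index.
        flat = 0
        for i in idx:
            flat = flat * n_tiles + i
        features[t * n_tiles ** d + flat] = 1.0
    return features

# Example: a 2-D observation becomes an 8 * 8**2 = 512-dim sparse vector
# with exactly one active feature per tiling.
phi = tile_code([0.3, -0.7], low=[-1, -1], high=[1, 1])
```

Because only `n_tilings` features are ever active, downstream NN updates touch a small, localized slice of the first-layer weights, which is one intuition for why such preprocessing can reduce interference between successive online updates.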

Sun Oct 25 2020
Machine Learning
How to Make Deep RL Work in Practice
In recent years, challenging control problems have become solvable with deep reinforcement learning (RL). To be able to use RL for large-scale real-world applications, a certain degree of reliability is necessary. Reported results of state-of-the-art algorithms are often difficult to reproduce.
Sat Sep 29 2018
Artificial Intelligence
Generalization and Regularization in DQN
Deep reinforcement learning algorithms have shown an impressive ability to learn complex control policies in high-dimensional tasks. Despite the ever-increasing performance on popular benchmarks, policies learned by deep reinforcement learning algorithms can struggle to generalize.
Mon Jul 15 2019
Artificial Intelligence
PPO Dash: Improving Generalization in Deep Reinforcement Learning
Deep reinforcement learning is prone to overfitting. The Obstacle Tower Challenge addresses this by using randomized environments and separate seeds for training, validation, and test runs.
Tue May 12 2020
Artificial Intelligence
Unbiased Deep Reinforcement Learning: A General Training Framework for Existing and Future Algorithms
Deep neural networks have been successfully applied to the domain of reinforcement learning. We propose a novel training framework that is conceptually comprehensible. We employ Monte Carlo sampling to obtain raw data inputs and train on them in batches to form Markov decision process sequences.
Wed Sep 16 2020
Artificial Intelligence
Transfer Learning in Deep Reinforcement Learning: A Survey
Reinforcement Learning (RL) is a key technique to address sequential decision-making problems. Recent years have witnessed remarkable progress in RL by virtue of the fast development of deep neural networks. Transfer learning has arisen as an important technique to tackle various challenges faced by RL.
Wed Nov 28 2018
Artificial Intelligence
Unsupervised Control Through Non-Parametric Discriminative Rewards
We present an unsupervised learning algorithm that trains agents to achieve perceptually specified goals using only a stream of observations and actions. Our agent simultaneously learns a goal-conditioned policy and a goal achievement reward function.