Published on Fri Aug 02 2019

Improving Deep Reinforcement Learning in Minecraft with Action Advice

Spencer Frazier, Mark Riedl

Interactive machine learning is only practical when the number of human interactions is limited, requiring a balance between human teacher effort and agent performance. We hypothesize that interactive machine learning (IML), wherein human teachers play a direct role in training, may alleviate agent susceptibility to aliasing.

Abstract

Training deep reinforcement learning agents to perform complex behaviors in 3D virtual environments requires significant computational resources. This is especially true in environments with high degrees of aliasing, where many states share nearly identical visual features. Minecraft is an exemplar of such an environment. We hypothesize that interactive machine learning (IML), wherein human teachers play a direct role in training through demonstrations, critique, or action advice, may alleviate agent susceptibility to aliasing. However, interactive machine learning is only practical when the number of human interactions is limited, requiring a balance between human teacher effort and agent performance. We conduct experiments with two reinforcement learning algorithms that enable human teachers to give action advice, Feedback Arbitration and Newtonian Action Advice, under visual aliasing conditions. To assess potential cognitive load per advice type, we vary the accuracy and frequency of various human action advice techniques. We examine training efficiency, robustness against infrequent and inaccurate advisor input, and sensitivity to aliasing.
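As a rough illustration of what varying the accuracy and frequency of action advice could look like, the sketch below simulates an advisor that speaks only some of the time and is only sometimes right. It is not the paper's Feedback Arbitration or Newtonian Action Advice algorithm; the function names, the parameters `frequency` and `accuracy`, and the rule that advice simply overrides the policy are all assumptions made for illustration.

```python
import random


def simulated_advice(optimal_action, n_actions, frequency, accuracy, rng):
    """Hypothetical synthetic advisor.

    With probability `frequency` the advisor gives advice at all; when it
    does, the advice equals `optimal_action` with probability `accuracy`
    and is a uniformly chosen different action otherwise.
    Returns None when the advisor stays silent. Requires n_actions >= 2.
    """
    if rng.random() >= frequency:
        return None  # advisor is silent on this step
    if rng.random() < accuracy:
        return optimal_action
    wrong_actions = [a for a in range(n_actions) if a != optimal_action]
    return rng.choice(wrong_actions)


def choose_action(policy_action, advice):
    """Assumed combination rule: follow advice when present, else the policy."""
    return policy_action if advice is None else advice


rng = random.Random(0)
advice = simulated_advice(optimal_action=2, n_actions=4,
                          frequency=0.5, accuracy=0.8, rng=rng)
action = choose_action(policy_action=1, advice=advice)
```

Sweeping `frequency` and `accuracy` over a grid with an advisor of this kind is one way to probe robustness to infrequent or inaccurate input without requiring a human in every training run.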

Tue Feb 12 2019
Machine Learning
Deep Reinforcement Learning from Policy-Dependent Human Feedback
Intelligent agents must be able to learn complex behaviors as specified by (non-expert) human users. They will need to learn these behaviors within a reasonable amount of time while efficiently leveraging the sparse feedback a human trainer is capable of providing.
Fri Feb 15 2019
Artificial Intelligence
Neural-encoding Human Experts' Domain Knowledge to Warm Start Reinforcement Learning
Deep reinforcement learning has been successful in a variety of tasks, such as game playing and robotic manipulation. Our approach permits the encoding of domain knowledge directly into a neural decision tree. We empirically validate our approach on two OpenAI Gym tasks and two modified StarCraft 2 tasks.
Fri Sep 27 2019
Artificial Intelligence
Playing Atari Ball Games with Hierarchical Reinforcement Learning
Human beings are particularly good at reasoning and inference from just a few examples. We argue that in solving real-world problems, especially when the task is designed by humans, there are typically instructions from user manuals and/or human experts. These instructions have tremendous value in designing a reinforcement learning system.
Tue Sep 12 2017
Artificial Intelligence
Explore, Exploit or Listen: Combining Human Feedback and Policy Model to Speed up Deep Reinforcement Learning in 3D Worlds
We describe a method to use discrete human feedback to enhance the performance of deep learning agents in virtual three-dimensional environments. This enables deep reinforcement learning algorithms to determine the most appropriate time to listen to the human feedback, exploit the current policy model, or explore the agent's environment.
Mon Mar 16 2020
Neural Networks
Improving Performance in Reinforcement Learning by Breaking Generalization in Neural Networks
Reinforcement learning systems require good representations to work well. We find that simply re-mapping the input observations to a high-dimensional space improves learning speed and parameter sensitivity. We also show this preprocessing reduces interference in prediction tasks.
Tue May 23 2017
Artificial Intelligence
Thinking Fast and Slow with Deep Learning and Tree Search
Expert Iteration (ExIt) is a novel reinforcement learning algorithm. It decomposes the problem into separate planning and generalisation tasks. Planning new policies is performed by tree search, while a deep neural network generalises those plans.