Published on Wed Jan 06 2021

One-shot Policy Elicitation via Semantic Reward Manipulation

Aaquib Tabrez, Ryan Leonard, Bradley Hayes

Single-shot Policy Explanation for Augmenting Rewards (SPEAR) is a sequential optimization algorithm. It uses semantic explanations derived from combinations of planning predicates to augment agents' reward functions. We demonstrate that our method makes substantial improvements over the current state-of-the-art.

Abstract

Synchronizing expectations and knowledge about the state of the world is an essential capability for effective collaboration. For robots to effectively collaborate with humans and other autonomous agents, it is critical that they be able to generate intelligible explanations to reconcile differences between their understanding of the world and that of their collaborators. In this work we present Single-shot Policy Explanation for Augmenting Rewards (SPEAR), a novel sequential optimization algorithm that uses semantic explanations derived from combinations of planning predicates to augment agents' reward functions, driving their policies to exhibit more optimal behavior. We provide an experimental validation of our algorithm's policy manipulation capabilities in two practically grounded applications and conclude with a performance analysis of SPEAR on domains of increasingly complex state space and predicate counts. We demonstrate that our method makes substantial improvements over the state-of-the-art in terms of runtime and addressable problem size, enabling an agent to leverage its own expertise to communicate actionable information to improve another's performance.
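The core idea in the abstract (augmenting an agent's reward function with semantic explanations expressed as planning predicates, so that the induced policy becomes more optimal) can be illustrated with a toy sketch. This is not the authors' SPEAR implementation; the chain domain, predicate, and function names below are illustrative assumptions. A semantic explanation such as "state 4 is a charging station" is modeled as a predicate over states whose satisfaction adds a reward bonus, and the greedy policy from value iteration shifts accordingly:

```python
# Toy sketch of predicate-based reward augmentation (illustrative only;
# not the SPEAR algorithm itself). A 5-state chain with actions -1/+1.
N_STATES, GAMMA = 5, 0.9
ACTIONS = (-1, +1)

def step(s, a):
    """Deterministic transition, clamped to the chain's endpoints."""
    return min(max(s + a, 0), N_STATES - 1)

def base_reward(s):
    # The agent's original model only knows about a small goal at state 0.
    return 1.0 if s == 0 else 0.0

def charging_station(s):
    # Hypothetical planning predicate derived from a semantic explanation.
    return s == N_STATES - 1

def augment(reward_fn, predicate, bonus):
    """Return a reward function with `bonus` added where `predicate` holds."""
    return lambda s: reward_fn(s) + (bonus if predicate(s) else 0.0)

def greedy_policy(reward_fn, iters=200):
    """Value iteration, then the greedy action in each state."""
    V = [0.0] * N_STATES
    for _ in range(iters):
        V = [max(reward_fn(step(s, a)) + GAMMA * V[step(s, a)]
                 for a in ACTIONS)
             for s in range(N_STATES)]
    return [max(ACTIONS,
                key=lambda a, s=s: reward_fn(step(s, a)) + GAMMA * V[step(s, a)])
            for s in range(N_STATES)]

base = greedy_policy(base_reward)
augmented = greedy_policy(augment(base_reward, charging_station, 5.0))
print(base)       # drifts left toward the small known goal
print(augmented)  # the predicate bonus flips the policy toward state 4
```

Under the base reward the policy moves left from every state; once the explanation's predicate contributes a bonus at state 4, the same planner prefers moving right. SPEAR's contribution, per the abstract, is selecting which predicate combinations to communicate so that a single such augmentation suffices.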

Sat Jun 13 2020
Artificial Intelligence
Online Bayesian Goal Inference for Boundedly-Rational Planning Agents
Tue Feb 12 2019
Artificial Intelligence
Preferences Implicit in the State of the World
When a robot is deployed in an environment that humans act in, the state of the environment is already optimized for what humans want. We develop an algorithm based on Maximum Causal Entropy IRL and use it to evaluate the idea.
Wed Jun 12 2019
Artificial Intelligence
Search on the Replay Buffer: Bridging Planning and Reinforcement Learning
The history of learning for control has been an exciting back and forth between two broad classes of algorithms: planning and reinforcement learning. Our aim is to decompose the task of reaching a distant goal state into a sequence of easier tasks, each of which corresponds to reaching a subgoal.
Tue Aug 01 2017
Artificial Intelligence
Balancing Explicability and Explanation in Human-Aware Planning
Human-aware planning requires an agent to be aware of the intentions, capabilities, and mental model of the human in the loop during its decision process. This can involve generating plans that are explicable to a human observer as well as providing explanations. This has led to the notion of "multi-model …
Wed Jun 23 2021
Artificial Intelligence
Not all users are the same: Providing personalized explanations for sequential decision making problems
There is a growing interest in designing autonomous agents that can work alongside humans. Such agents will undoubtedly be expected to explain their behavior and decisions. Most works tend to focus on methods that generate one-size-fits-all explanations.
Sat Feb 03 2018
Artificial Intelligence
Plan Explanations as Model Reconciliation -- An Empirical Study
Algorithms used to explain the behavior of autonomous systems have not been evaluated in settings with actual humans in the loop. The applicability of such explanations to human-AI and human-robot interactions remains suspect. We demonstrate to what extent the properties of these algorithms hold as they are evaluated by …