Autonomous self-explanation of behavior for interactive reinforcement learning agents

Yosuke Fukuchi, Masahiko Osawa, Hiroshi Yamakawa, Michita Imai

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

5 Citations (Scopus)

Abstract

In cooperation, the workers must know how co-workers behave. However, an agent's policy, which is embedded in a statistical machine learning model, is hard to understand, and requires much time and knowledge to comprehend. Therefore, it is difficult for people to predict the behavior of machine learning robots, which makes Human Robot Cooperation challenging. In this paper, we propose Instruction-based Behavior Explanation (IBE), a method to explain an autonomous agent's future behavior. In IBE, an agent can autonomously acquire the expressions to explain its own behavior by reusing the instructions given by a human expert to accelerate the learning of the agent's policy. IBE also enables a developmental agent, whose policy may change during the cooperation, to explain its own behavior with sufficient time granularity.

Original language: English
Title of host publication: HAI 2017 - Proceedings of the 5th International Conference on Human Agent Interaction
Publisher: Association for Computing Machinery, Inc
Pages: 97-101
Number of pages: 5
ISBN (Electronic): 9781450351133
Publication status: Published - 2017 Oct 17
Event: 5th International Conference on Human Agent Interaction, HAI 2017 - Bielefeld, Germany
Duration: 2017 Oct 17 to 2017 Oct 20

Publication series

Name: HAI 2017 - Proceedings of the 5th International Conference on Human Agent Interaction

Other

Other: 5th International Conference on Human Agent Interaction, HAI 2017
Country: Germany
City: Bielefeld
Period: 17/10/17 to 17/10/20

Keywords

  • Human Robot Cooperation
  • Instruction-based Behavior Explanation
  • Interactive Reinforcement Learning

ASJC Scopus subject areas

  • Human-Computer Interaction
