An Intrinsically Motivated Robot Explores Non-reward Environments with Output Arbitration

Takuma Seno, Masahiko Osawa, Michita Imai

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

In the real world, rewards tend to be sparse because the state space is huge. Reinforcement learning agents must acquire exploration skills to obtain rewards in such environments. Curiosity, defined as an internally generated reward for state prediction error, can encourage agents to explore. However, when a robot learns its policy by reinforcement learning, changes in the policy's outputs cause jerking because of inertia. Jerking prevents the state prediction from converging, which can make policy learning unstable. In this paper, we propose Arbitrable Intrinsically Motivated Exploration (AIME), which enables robots to learn curiosity-based exploration stably. AIME uses the Accumulator Based Arbitration Model (ABAM), which we previously proposed as an ensemble learning method inspired by the prefrontal cortex. ABAM adjusts motor controls to improve the stability of reward generation and reinforcement learning. In experiments, we show that a robot can explore a non-reward simulated environment with AIME.
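The idea of curiosity as a reward for state prediction error can be illustrated with a minimal sketch. This is a generic example under stated assumptions, not the AIME or ABAM implementation: the linear forward model, the mean-squared-error reward, the learning rate, and all function names are hypothetical choices made for illustration.

```python
# Illustrative sketch only: a generic prediction-error curiosity signal.
# It does not reproduce AIME or ABAM; every name here is hypothetical.

def predict_next_state(weights, state, action):
    """Assumed linear forward model: one weight row per next-state dimension."""
    features = list(state) + [action]
    return [sum(w * x for w, x in zip(row, features)) for row in weights]


def intrinsic_reward(predicted, observed):
    """Curiosity reward taken as the mean squared state prediction error."""
    return sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed)


def update_model(weights, state, action, observed, lr=0.1):
    """One gradient-descent step on the mean squared prediction error."""
    features = list(state) + [action]
    predicted = predict_next_state(weights, state, action)
    n = len(observed)
    return [
        [w - lr * 2.0 * (p - o) * x / n for w, x in zip(row, features)]
        for row, p, o in zip(weights, predicted, observed)
    ]


# A single transition seen repeatedly: the reward shrinks as the model learns it.
state, action, next_state = [1.0, 0.0], 0.5, [0.5, 0.2]
weights = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]

reward_before = intrinsic_reward(predict_next_state(weights, state, action), next_state)
for _ in range(20):
    weights = update_model(weights, state, action, next_state)
reward_after = intrinsic_reward(predict_next_state(weights, state, action), next_state)
```

In this sketch a transition that the model already predicts well yields little reward, so an agent maximizing the signal is pushed toward unfamiliar states. It also hints at why jerky motion is a problem: if the observed next state is noisy, the prediction error never settles and the reward signal stays unstable.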

Original language: English
Title of host publication: Biologically Inspired Cognitive Architectures 2018 - Proceedings of the Ninth Annual Meeting of the BICA Society
Editors: Alexei V. Samsonovich
Publisher: Springer Verlag
Pages: 283-289
Number of pages: 7
ISBN (Print): 9783319993157
DOIs
Publication status: Published - 2019
Event: 9th Annual International Conference on Biologically Inspired Cognitive Architectures, BICA 2018 - Prague, Czech Republic
Duration: 2018 Aug 22 → 2018 Aug 24

Publication series

Name: Advances in Intelligent Systems and Computing
Volume: 848
ISSN (Print): 2194-5357

Other

Other: 9th Annual International Conference on Biologically Inspired Cognitive Architectures, BICA 2018
Country/Territory: Czech Republic
City: Prague
Period: 18/8/22 → 18/8/24

Keywords

  • Deep reinforcement learning
  • Prefrontal cortex
  • Robot navigation

ASJC Scopus subject areas

  • Control and Systems Engineering
  • Computer Science(all)
