An Intrinsically Motivated Robot Explores Non-reward Environments with Output Arbitration

Takuma Seno, Masahiko Osawa, Michita Imai

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

In real-world environments, rewards are often sparse because the state space is huge. Reinforcement learning agents must acquire exploration skills to obtain rewards in such environments. In that case, curiosity, defined as an internally generated reward based on state prediction error, can encourage agents to explore. However, when a robot learns its policy by reinforcement learning, abrupt changes in the policy's outputs cause jerking because of inertia. Jerking prevents state prediction from converging, which can make policy learning unstable. In this paper, we propose Arbitrable Intrinsically Motivated Exploration (AIME), which enables robots to learn curiosity-based exploration stably. AIME uses the Accumulator Based Arbitration Model (ABAM), an ensemble learning method inspired by the prefrontal cortex that we previously proposed. ABAM adjusts motor controls to improve the stability of reward generation and reinforcement learning. In experiments, we show that a robot can explore a non-reward simulated environment with AIME.
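The curiosity signal described in the abstract, an internally generated reward proportional to state-prediction error, can be sketched as follows. This is a minimal illustration of the general curiosity-reward idea, assuming a mean-squared-error forward model and a `scale` coefficient; it is not the paper's exact formulation.

```python
import numpy as np

def intrinsic_reward(predicted_next_state, actual_next_state, scale=1.0):
    """Curiosity bonus: squared prediction error of a learned forward model.

    States the forward model predicts well yield little reward, so the
    agent is driven toward states it cannot yet predict.
    """
    error = np.asarray(predicted_next_state, dtype=float) - \
            np.asarray(actual_next_state, dtype=float)
    return scale * float(np.mean(error ** 2))

# A well-predicted transition gives no bonus; a surprising one does.
r_familiar = intrinsic_reward([0.1, 0.2], [0.1, 0.2])   # 0.0
r_novel = intrinsic_reward([0.1, 0.2], [0.9, -0.5])     # > 0.0
```

In a full agent, this bonus would replace or augment the (absent) environment reward, and the forward model would be trained online on observed transitions, which is exactly where the jerking problem the abstract mentions would destabilize convergence.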

Language: English
Title of host publication: Biologically Inspired Cognitive Architectures 2018 - Proceedings of the Ninth Annual Meeting of the BICA Society
Editors: Alexei V. Samsonovich
Publisher: Springer Verlag
Pages: 283-289
Number of pages: 7
ISBN (Print): 9783319993157
DOI: 10.1007/978-3-319-99316-4_37
Publication status: Published - 2019 Jan 1
Event: 9th Annual International Conference on Biologically Inspired Cognitive Architectures, BICA 2018 - Prague, Czech Republic
Duration: 2018 Aug 22 - 2018 Aug 24

Publication series

Name: Advances in Intelligent Systems and Computing
Volume: 848
ISSN (Print): 2194-5357



Keywords

  • Deep reinforcement learning
  • Prefrontal cortex
  • Robot navigation

ASJC Scopus subject areas

  • Control and Systems Engineering
  • Computer Science(all)

Cite this

Seno, T., Osawa, M., & Imai, M. (2019). An Intrinsically Motivated Robot Explores Non-reward Environments with Output Arbitration. In A. V. Samsonovich (Ed.), Biologically Inspired Cognitive Architectures 2018 - Proceedings of the Ninth Annual Meeting of the BICA Society (pp. 283-289). (Advances in Intelligent Systems and Computing; Vol. 848). Springer Verlag. https://doi.org/10.1007/978-3-319-99316-4_37
