An FPGA-Based On-Device Reinforcement Learning Approach using Online Sequential Learning

Hirohisa Watanabe, Mineto Tsukada, Hiroki Matsutani

Research output: Conference contribution

Abstract

DQN (Deep Q-Network) is a method for performing Q-learning in reinforcement learning using deep neural networks. DQNs require a large buffer and batch processing for experience replay and rely on backpropagation-based iterative optimization, which makes them difficult to implement on resource-limited edge devices. In this paper, we propose a lightweight on-device reinforcement learning approach for low-cost FPGA devices. It exploits a recently proposed neural-network-based on-device learning approach that does not rely on backpropagation but instead uses an OS-ELM (Online Sequential Extreme Learning Machine) based training algorithm. In addition, we propose a combination of L2 regularization and spectral normalization for on-device reinforcement learning so that the output values of the neural network fit within a certain range and the reinforcement learning becomes stable. The proposed approach is designed for the PYNQ-Z1 board as a low-cost FPGA platform. Evaluation results using OpenAI Gym demonstrate that the proposed algorithm and its FPGA implementation complete a CartPole-v0 task 29.77x and 89.40x faster, respectively, than a conventional DQN-based approach when the number of hidden-layer nodes is 64.
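
The backpropagation-free training idea mentioned in the abstract can be illustrated with a minimal sketch of an OS-ELM-style sequential update. This is not the authors' implementation or their FPGA design. The names `OSELM`, `spectral_norm`, and `l2`, the ReLU activation, the single scalar output, the batch size of 1, and the choice to apply spectral normalization to the fixed random input weights are all assumptions made for illustration.

```python
import math
import random


def spectral_norm(w, iters=50):
    """Estimate the largest singular value of w (a list of rows) by power iteration."""
    rows, cols = len(w), len(w[0])
    v = [1.0 / math.sqrt(cols)] * cols
    sigma = 0.0
    for _ in range(iters):
        u = [sum(w[i][j] * v[j] for j in range(cols)) for i in range(rows)]
        sigma = math.sqrt(sum(x * x for x in u))  # ||W v|| with ||v|| = 1
        if sigma == 0.0:
            return 0.0
        v = [sum(w[i][j] * u[i] for i in range(rows)) for j in range(cols)]
        norm_v = math.sqrt(sum(x * x for x in v))
        v = [x / norm_v for x in v]
    return sigma


class OSELM:
    """Hypothetical single-output OS-ELM trained one sample at a time."""

    def __init__(self, n_in, n_hidden, l2=1.0, seed=0):
        rng = random.Random(seed)
        alpha = [[rng.uniform(-1.0, 1.0) for _ in range(n_hidden)]
                 for _ in range(n_in)]
        sigma = spectral_norm(alpha)
        # Assumption: spectral normalization rescales the fixed random input weights.
        self.alpha = [[a / sigma for a in row] for row in alpha]
        self.bias = [rng.uniform(-1.0, 1.0) for _ in range(n_hidden)]
        self.beta = [0.0] * n_hidden  # trainable output weights
        # P_0 = (l2 * I)^-1 encodes the L2 regularization; no initial batch is needed.
        self.p = [[1.0 / l2 if i == j else 0.0 for j in range(n_hidden)]
                  for i in range(n_hidden)]

    def hidden(self, x):
        """ReLU hidden activations (the activation choice is an assumption)."""
        return [max(0.0, self.bias[j] + sum(x[i] * self.alpha[i][j]
                                            for i in range(len(x))))
                for j in range(len(self.bias))]

    def predict(self, x):
        return sum(h * b for h, b in zip(self.hidden(x), self.beta))

    def update(self, x, target):
        """One sequential update: a rank-1 change to P, with no matrix inversion."""
        h = self.hidden(x)
        n = len(h)
        ph = [sum(self.p[i][j] * h[j] for j in range(n)) for i in range(n)]
        denom = 1.0 + sum(h[i] * ph[i] for i in range(n))
        err = target - sum(h[i] * self.beta[i] for i in range(n))
        for i in range(n):
            for j in range(n):
                # P is symmetric, so P h h^T P equals the outer product of ph with itself.
                self.p[i][j] -= ph[i] * ph[j] / denom
            self.beta[i] += ph[i] / denom * err
```

In a Q-learning setting, `target` would typically be the observed reward plus the discounted maximum Q-value of the next state, and one such model (or one output per action) would be updated after each environment step. With a batch size of 1 the update needs only multiply-accumulate operations and a single division, which is the kind of property that makes this family of updates attractive for small devices.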

Original language: English
Title of host publication: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2021 - In conjunction with IEEE IPDPS 2021
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 96-103
Number of pages: 8
ISBN (Electronic): 9781665435772
Publication status: Published - Jun 2021
Event: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2021 - Virtual, Portland, United States
Duration: 17 May 2021 → …

Publication series

Name: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2021 - In conjunction with IEEE IPDPS 2021

Conference

Conference: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2021
Country/Territory: United States
City: Virtual, Portland
Period: 17 May 2021 → …

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Hardware and Architecture
  • Information Systems
