Abstract
More sensors do not necessarily result in more appropriate state descriptions, so a mobile robot must select an appropriate set of sensors in addition to learning a state-action function in a reinforcement learning setting. We present a multi-armed bandit formulation of this sensor-selection problem and apply it to a mobile robot navigation task. We modify the reinforcement comparison method to suit our problem and build a system in which the selection of an optimal set of sensors and the learning of state-action functions are performed simultaneously. Our approach is evaluated on a Khepera robot simulator, and the results show that it works well as an integrated learning system, identifying the best set of sensors and reducing learning time.
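The abstract casts sensor-set selection as a multi-armed bandit solved with a (modified) reinforcement comparison method. As background, the sketch below shows the standard reinforcement comparison update (softmax selection over preferences, with a running reference reward), where each arm stands in for a candidate sensor set. The class name, step sizes, and toy reward signal are illustrative assumptions, not the paper's exact modification.

```python
import numpy as np

class ReinforcementComparisonBandit:
    """Standard reinforcement comparison bandit; arms model candidate sensor sets."""

    def __init__(self, n_arms, alpha=0.1, beta=0.1, rng=None):
        self.preferences = np.zeros(n_arms)  # preference p(a) per arm
        self.reference_reward = 0.0          # running reference reward r_bar
        self.alpha = alpha                   # step size for r_bar
        self.beta = beta                     # step size for preferences
        self.rng = rng or np.random.default_rng()

    def select_arm(self):
        # Softmax (Gibbs) action selection over current preferences.
        exp_p = np.exp(self.preferences - self.preferences.max())
        probs = exp_p / exp_p.sum()
        return self.rng.choice(len(probs), p=probs)

    def update(self, arm, reward):
        # Raise the preference of arms whose reward beats the reference.
        delta = reward - self.reference_reward
        self.preferences[arm] += self.beta * delta
        self.reference_reward += self.alpha * delta


if __name__ == "__main__":
    # Toy usage: three hypothetical sensor sets with different mean rewards.
    bandit = ReinforcementComparisonBandit(n_arms=3)
    true_means = [0.2, 0.5, 0.8]
    rng = np.random.default_rng(0)
    for _ in range(2000):
        arm = bandit.select_arm()
        bandit.update(arm, rng.normal(true_means[arm], 0.1))
    print("learned preferences:", bandit.preferences)
```

In the paper's setting, the reward fed to `update` would come from the navigation task while a state-action function is learned for the chosen sensor set, so the two learning processes run simultaneously.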
Original language | English |
---|---|
Pages (from-to) | 870-878 |
Number of pages | 9 |
Journal | IEEJ Transactions on Electronics, Information and Systems |
Volume | 125 |
Issue number | 6 |
DOI | |
Publication status | Published - 2005 |
Externally published | Yes |
ASJC Scopus subject areas
- Electrical and Electronic Engineering