We propose a method to improve the performance of R-learning, a reinforcement learning algorithm, by using multiple state-action value tables. Unlike Q-learning or Sarsa, R-learning learns a policy that maximizes undiscounted rewards. Using multiple state-action value tables induces substantial exploration when it is needed and allows R-learning to perform well. The efficiency of the proposed method is verified through experiments in a simulation environment.
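For readers unfamiliar with R-learning, the following is a minimal sketch of the standard single-table update as it is commonly presented in textbooks, not the multi-table method proposed here, whose details the abstract does not give. The function name `r_learning_step`, the dictionary-based table `R`, and the step sizes `beta` and `alpha` are illustrative assumptions.

```python
def r_learning_step(R, rho, s, a, reward, s_next, actions, beta=0.1, alpha=0.01):
    """One tabular R-learning update (textbook form; hypothetical helper).

    R      : dict mapping (state, action) -> relative value, missing entries are 0
    rho    : current estimate of the average reward per step
    Returns the updated rho; R is modified in place.
    """
    best_next = max(R.get((s_next, b), 0.0) for b in actions)
    old = R.get((s, a), 0.0)

    # Relative-value update: the reward is measured against the average
    # reward rho instead of being discounted.
    R[(s, a)] = old + beta * (reward - rho + best_next - old)

    # The average-reward estimate is adjusted only when the action taken
    # is greedy with respect to the updated table.
    best_here = max(R.get((s, b), 0.0) for b in actions)
    if R[(s, a)] == best_here:
        rho += alpha * (reward - rho + best_next - best_here)
    return rho
```

Because `rho` changes only on greedy steps, exploratory actions do not distort the average-reward estimate, which is one reason the amount and timing of exploration matter for this algorithm.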
|Number of pages|11|
|Journal|IEEJ Transactions on Electronics, Information and Systems|
|Publication status|Published - 2006|
Keywords
- Autonomous mobile robot
- Reinforcement learning
ASJC Scopus subject areas
- Electrical and Electronic Engineering