Decision-making is one of the most important intellectual abilities of not only humans but also other biological organisms, helping their survival. This ability, however, may not be limited to biological systems and may be exhibited by physical systems. Here we demonstrate that any physical object, as long as its volume is conserved when coupled with suitable operations, provides a sophisticated decision-making capability. We consider the multi-armed bandit problem (MBP), the problem of finding, as accurately and quickly as possible, the most profitable option from a set of options that gives stochastic rewards. Efficient MBP solvers are useful for many practical applications, because MBP abstracts a variety of decision-making problems in real-world situations in which an efficient trial-and-error is required. These decisions are made as dictated by a physical object, which is moved in a manner similar to the fluctuations of a rigid body in a tug-of-war (TOW) game. This method, called 'TOW dynamics', exhibits higher efficiency than conventional reinforcement learning algorithms. We show analytical calculations that validate statistical reasons for TOW dynamics to produce the high performance despite its simplicity. These results imply that various physical systems in which some conservation law holds can be used to implement an efficient 'decision-making object'. The proposed scheme will provide a new perspective to open up a physics-based analog computing paradigm and to understanding the biological information-processing principles that exploit their underlying physics.
ASJC Scopus subject areas
- Physics and Astronomy(all)