Tug-of-war model for the two-bandit problem

Nonlocally-correlated parallel exploration via resource conservation

Song Ju Kim, Masashi Aono, Masahiko Hara

Research output: Contribution to journal › Article

30 Citations (Scopus)

Abstract

We propose a model - the "tug-of-war (TOW) model" - to conduct unique parallel searches using many nonlocally correlated search agents. The model is based on a property of the single-celled amoeba, the true slime mold Physarum, which maintains a constant intracellular resource volume while collecting environmental information by concurrently expanding and shrinking its branches. The conservation law entails a "nonlocal correlation" among the branches: a volume increment in one branch is immediately compensated by volume decrement(s) in the other branch(es). This nonlocal correlation was shown to be useful for decision making in the face of a dilemma. The multi-armed bandit problem is to determine the optimal strategy for maximizing the total reward sum under two incompatible demands: exploiting the rewards already obtained from collected information, or exploring new information to acquire higher payoffs at the cost of some risk. Our model manages this "exploration-exploitation dilemma" efficiently and exhibits good performance. Its average accuracy rate is higher than those of well-known algorithms such as the modified ε-greedy algorithm and the modified softmax algorithm, especially for relatively difficult problems. Moreover, our model flexibly adapts to changing environments, a property essential for living organisms surviving in uncertain environments.
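
To make the mechanism concrete, here is a minimal, illustrative Python sketch of a TOW-style player for the two-armed bandit. It is a reading of the abstract, not the paper's exact formulation: a single displacement variable x stands in for the conserved resource, so a pull toward one arm is by construction an equal pull away from the other (the "nonlocal correlation"), and the function name, reward probabilities, and update constants delta and omega are all assumptions made for illustration.

import random

def tow_two_bandit(p=(0.6, 0.4), steps=1000, delta=1.0, omega=0.5, seed=0):
    # Hypothetical TOW-style sketch; the update rule and constants are
    # illustrative assumptions, not the published model.
    rng = random.Random(seed)
    x = 0.0   # displacement of the conserved resource between the two arms
    hits = 0  # how often the better arm (arm 0 here) was played
    for _ in range(steps):
        arm = 0 if x >= 0 else 1  # the resource's position selects the arm
        rewarded = rng.random() < p[arm]
        pull = 1.0 if arm == 0 else -1.0
        # A reward pulls the resource toward the played arm; a miss pushes
        # it back toward the other arm. The two moves are exact opposites,
        # so the total "volume" shared by the two branches stays constant.
        x += pull * (delta if rewarded else -omega)
        hits += (arm == 0)
    return hits / steps

print(tow_two_bandit())  # fraction of plays given to the better arm

With p[0] > p[1], the returned fraction should settle well above 0.5; benchmarking such a player against ε-greedy and softmax baselines is the kind of comparison the accuracy-rate claim above refers to, though the actual experimental settings are given in the paper.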

Original language: English
Pages (from-to): 29-36
Number of pages: 8
Journal: BioSystems
Volume: 101
Issue number: 1
DOI: 10.1016/j.biosystems.2010.04.002
Publication status: Published - 2010 Jul
Externally published: Yes

Keywords

  • Amoeba-based computing
  • Bio-inspired computing
  • Multi-armed bandit problem
  • Reinforcement learning

ASJC Scopus subject areas

  • Statistics and Probability
  • Modelling and Simulation
  • Biochemistry, Genetics and Molecular Biology (all)
  • Applied Mathematics

Cite this

Tug-of-war model for the two-bandit problem: Nonlocally-correlated parallel exploration via resource conservation. / Kim, Song Ju; Aono, Masashi; Hara, Masahiko.

In: BioSystems, Vol. 101, No. 1, 07.2010, p. 29-36.

Research output: Contribution to journal › Article

@article{0c86e5af52d94258beaa4ca51d8c18be,
title = "Tug-of-war model for the two-bandit problem: Nonlocally-correlated parallel exploration via resource conservation",
abstract = "We propose a model - the {"}tug-of-war (TOW) model{"} - to conduct unique parallel searches using many nonlocally correlated search agents. The model is based on a property of the single-celled amoeba, the true slime mold Physarum, which maintains a constant intracellular resource volume while collecting environmental information by concurrently expanding and shrinking its branches. The conservation law entails a {"}nonlocal correlation{"} among the branches: a volume increment in one branch is immediately compensated by volume decrement(s) in the other branch(es). This nonlocal correlation was shown to be useful for decision making in the face of a dilemma. The multi-armed bandit problem is to determine the optimal strategy for maximizing the total reward sum under two incompatible demands: exploiting the rewards already obtained from collected information, or exploring new information to acquire higher payoffs at the cost of some risk. Our model manages this {"}exploration-exploitation dilemma{"} efficiently and exhibits good performance. Its average accuracy rate is higher than those of well-known algorithms such as the modified ε-greedy algorithm and the modified softmax algorithm, especially for relatively difficult problems. Moreover, our model flexibly adapts to changing environments, a property essential for living organisms surviving in uncertain environments.",
keywords = "Amoeba-based computing, Bio-inspired computing, Multi-armed bandit problem, Reinforcement learning",
author = "Kim, {Song Ju} and Masashi Aono and Masahiko Hara",
year = "2010",
month = "7",
doi = "10.1016/j.biosystems.2010.04.002",
language = "English",
volume = "101",
pages = "29--36",
journal = "BioSystems",
issn = "0303-2647",
publisher = "Elsevier Ireland Ltd",
number = "1",
}

TY - JOUR

T1 - Tug-of-war model for the two-bandit problem

T2 - Nonlocally-correlated parallel exploration via resource conservation

AU - Kim, Song Ju

AU - Aono, Masashi

AU - Hara, Masahiko

PY - 2010/7

Y1 - 2010/7

AB - We propose a model - the "tug-of-war (TOW) model" - to conduct unique parallel searches using many nonlocally correlated search agents. The model is based on a property of the single-celled amoeba, the true slime mold Physarum, which maintains a constant intracellular resource volume while collecting environmental information by concurrently expanding and shrinking its branches. The conservation law entails a "nonlocal correlation" among the branches: a volume increment in one branch is immediately compensated by volume decrement(s) in the other branch(es). This nonlocal correlation was shown to be useful for decision making in the face of a dilemma. The multi-armed bandit problem is to determine the optimal strategy for maximizing the total reward sum under two incompatible demands: exploiting the rewards already obtained from collected information, or exploring new information to acquire higher payoffs at the cost of some risk. Our model manages this "exploration-exploitation dilemma" efficiently and exhibits good performance. Its average accuracy rate is higher than those of well-known algorithms such as the modified ε-greedy algorithm and the modified softmax algorithm, especially for relatively difficult problems. Moreover, our model flexibly adapts to changing environments, a property essential for living organisms surviving in uncertain environments.

KW - Amoeba-based computing

KW - Bio-inspired computing

KW - Multi-armed bandit problem

KW - Reinforcement learning

UR - http://www.scopus.com/inward/record.url?scp=77953609815&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=77953609815&partnerID=8YFLogxK

U2 - 10.1016/j.biosystems.2010.04.002

DO - 10.1016/j.biosystems.2010.04.002

M3 - Article

VL - 101

SP - 29

EP - 36

JO - BioSystems

JF - BioSystems

SN - 0303-2647

IS - 1

ER -