Barfuss, Wolfram; Meylahn, Janusz M.

doi:10.1038/s41598-023-27672-7

Title: Intrinsic fluctuations of reinforcement learning promote cooperation.
Authors: Barfuss, Wolfram; Meylahn, Janusz M.
Abstract: In this work, we ask for and answer what makes classical temporal-difference reinforcement learning with ϵ-greedy strategies cooperative. Cooperating in social dilemma situations is vital for animals, humans, and machines. While evolutionary theory revealed a range of mechanisms promoting cooperation, the conditions under which agents learn to cooperate are contested. Here, we demonstrate which and how individual elements of the multi-agent learning setting lead to cooperation. We use the iterated Prisoner's dilemma with one-period memory as a testbed. Each of the two learning agents learns a strategy that conditions the following action choices on both agents' action choices of the last round. We find that next to a high caring for future rewards, a low exploration rate, and a small learning rate, it is primarily intrinsic stochastic fluctuations of the reinforcement learning process which double the final rate of cooperation to up to 80%. Thus, inherent noise is not a necessary evil of the iterative learning process. It is a critical asset for the learning of cooperation. However, we also point out the trade-off between a high likelihood of cooperative behavior and achieving this in a reasonable amount of time. Our findings are relevant for purposefully designing cooperative algorithms and regulating undesired collusive effects.
Subjects: REWARD (Psychology); ITERATIVE learning control; REINFORCEMENT learning; LEARNING; COOPERATION; LEAD; LEARNING strategies
Publication: Scientific Reports, 2023, Vol 13, Issue 1, p1
ISSN: 2045-2322
Publication type: Article
DOI: 10.1038/s41598-023-27672-7
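
Illustration (not part of the record): the setting summarized in the abstract can be sketched roughly as below. This is a minimal sketch under stated assumptions, not the authors' implementation. The record does not say which temporal-difference variant the paper uses, so tabular Q-learning stands in as one common choice. The payoff values, the parameter defaults, the starting state, and the names `simulate` and `epsilon_greedy` are all hypothetical. The state each agent conditions on is the pair of actions both agents chose in the previous round, which is how "one-period memory" is read here.

```python
import random

# Assumed Prisoner's dilemma payoffs with T > R > P > S; not taken from the record.
R, S, T, P = 3.0, 0.0, 5.0, 1.0
ACTIONS = ("C", "D")  # cooperate, defect
# Maps (action of agent 0, action of agent 1) to (reward 0, reward 1).
PAYOFF = {
    ("C", "C"): (R, R),
    ("C", "D"): (S, T),
    ("D", "C"): (T, S),
    ("D", "D"): (P, P),
}
# One-period memory: the state is the joint action of the last round.
STATES = [(a, b) for a in ACTIONS for b in ACTIONS]


def epsilon_greedy(q, state, epsilon, rng):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q[(state, a)])


def simulate(rounds=10000, alpha=0.05, gamma=0.95, epsilon=0.05, seed=0):
    """Run two independent Q-learners; return (cooperation rate, Q-tables)."""
    rng = random.Random(seed)
    q = [{(s, a): 0.0 for s in STATES for a in ACTIONS} for _ in range(2)]
    state = ("C", "C")  # assumed starting state
    cooperative = 0
    for _ in range(rounds):
        acts = tuple(epsilon_greedy(q[i], state, epsilon, rng) for i in range(2))
        rewards = PAYOFF[acts]
        next_state = acts
        for i in range(2):
            # Q-learning temporal-difference update for agent i.
            best_next = max(q[i][(next_state, a)] for a in ACTIONS)
            td_error = rewards[i] + gamma * best_next - q[i][(state, acts[i])]
            q[i][(state, acts[i])] += alpha * td_error
        cooperative += acts.count("C")
        state = next_state
    return cooperative / (2 * rounds), q
```

In this sketch, `gamma` plays the role of the abstract's "caring for future rewards", `epsilon` the exploration rate, and `alpha` the learning rate. Calling `simulate` with different seeds gives a rough feel for how run-to-run stochasticity changes the observed cooperation rate; it is not expected to reproduce the figures reported in the abstract.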
