Reinforcement learning produces dominant strategies for the Iterated Prisoner's Dilemma.

Marc Harper; Vincent Knight; Martin Jones; Georgios Koutsovoulos; Nikoleta E Glynatsi; Owen Campbell

doi:10.1371/journal.pone.0188046

PLoS ONE (Jan 2017)

DOI: https://doi.org/10.1371/journal.pone.0188046
Journal volume & issue: Vol. 12, no. 12
p. e0188046

Abstract

We present tournament results and several powerful strategies for the Iterated Prisoner's Dilemma created using reinforcement learning techniques (evolutionary and particle swarm algorithms). These strategies are trained to perform well against a corpus of over 170 distinct opponents, including many well-known and classic strategies. All the trained strategies win standard tournaments against the total collection of other opponents. The trained strategies and one particular human-designed strategy are also the top performers in noisy tournaments.
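
To make the tournament setting in the abstract concrete, here is a minimal sketch of a round-robin Iterated Prisoner's Dilemma tournament with optional noise. It is not the authors' code. The payoff values (3, 0, 5, 1), the three toy strategies, the turn count, and the function names `play_match` and `round_robin` are all assumptions chosen for illustration. Noise is modelled here as an independent chance of flipping each chosen action, which is one common convention and may differ from the paper's.

```python
import random

# Conventional payoffs (assumed; the abstract does not state them):
# mutual cooperation 3/3, mutual defection 1/1, lone defector 5, lone cooperator 0.
PAYOFFS = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}


# Each strategy maps (own history, opponent history) to "C" or "D".
def tit_for_tat(own, opp):
    return opp[-1] if opp else "C"


def always_defect(own, opp):
    return "D"


def always_cooperate(own, opp):
    return "C"


def flip(action):
    return "D" if action == "C" else "C"


def play_match(p1, p2, turns=200, noise=0.0, rng=None):
    """Play one match and return the two total scores."""
    rng = rng or random.Random(0)
    h1, h2 = [], []
    s1 = s2 = 0
    for _ in range(turns):
        a1, a2 = p1(h1, h2), p2(h2, h1)
        # With probability `noise`, an action is flipped before it is played.
        if rng.random() < noise:
            a1 = flip(a1)
        if rng.random() < noise:
            a2 = flip(a2)
        r1, r2 = PAYOFFS[(a1, a2)]
        s1 += r1
        s2 += r2
        h1.append(a1)
        h2.append(a2)
    return s1, s2


def round_robin(players, turns=200, noise=0.0, seed=0):
    """Every pair of distinct players meets once; return total score per name."""
    rng = random.Random(seed)
    totals = {name: 0 for name in players}
    names = list(players)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            sa, sb = play_match(players[a], players[b], turns, noise, rng)
            totals[a] += sa
            totals[b] += sb
    return totals


if __name__ == "__main__":
    players = {
        "Tit For Tat": tit_for_tat,
        "Defector": always_defect,
        "Cooperator": always_cooperate,
    }
    print(round_robin(players, turns=200))
    print(round_robin(players, turns=200, noise=0.05))
```

In a sketch like this, a trained strategy would be one more entry in `players`, with its parameters tuned by an optimiser that uses tournament score as the objective. A field of three opponents says little about the corpus of over 170 described above.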

Published in PLoS ONE

ISSN: 1932-6203 (Online)
Publisher: Public Library of Science (PLoS)
Country of publisher: United States
LCC subjects: Medicine; Science
Website: https://journals.plos.org/plosone/
