Machine Learning Group, Département d’Informatique, Université Libre de Bruxelles, 1050 Brussels, Belgium; Artificial Intelligence Lab, Vakgroep Computerwetenschappen, Vrije Universiteit Brussel, 1050 Brussels, Belgium; Center for Human-Compatible AI, University of California, Berkeley, Berkeley, CA 94702, USA; Corresponding author
Marco Saponara
Machine Learning Group, Département d’Informatique, Université Libre de Bruxelles, 1050 Brussels, Belgium
Jorge M. Pacheco
Centro de Biologia Molecular e Ambiental, Universidade do Minho, 4710-057 Braga, Portugal; Departamento de Matemática e Aplicações, Universidade do Minho, 4710-057 Braga, Portugal; ATP-group, P-2744-016 Porto Salvo, Portugal
Francisco C. Santos
ATP-group, P-2744-016 Porto Salvo, Portugal; INESC-ID and Instituto Superior Técnico, Universidade de Lisboa, IST-Taguspark, 2744-016 Porto Salvo, Portugal
Summary: Even though the Theory of Mind in upper primates has been under investigation for decades, how it may evolve remains an open problem. We propose here an evolutionary game-theoretical model in which a finite population of individuals may use reasoning strategies to infer a response to the anticipated behavior of others within the context of a sequential dilemma, i.e., the Centipede Game. We show that strategies with bounded reasoning evolve and flourish under natural selection, provided they are allowed to make reasoning mistakes and a temptation for higher future gains is in place. We further show that non-deterministic reasoning co-evolves with an optimism bias that may lead to the selection of new equilibria, closely associated with average behavior observed in experimental data. This work reveals both a novel perspective on the evolution of bounded rationality and a co-evolutionary link between the evolution of Theory of Mind and the emergence of misbeliefs.