Algorithm exploitation: Humans are keen to exploit benevolent AI
Jurgis Karpus,
Adrian Krüger,
Julia Tovar Verba,
Bahador Bahrami,
Ophelia Deroy
Affiliations
Jurgis Karpus
Faculty of Philosophy, Philosophy of Science and the Study of Religion, LMU Munich, Geschwister-Scholl-Platz 1, 80539 Munich, Germany; Department of General and Educational Psychology, LMU Munich, Leopoldstraße 13, 80802 Munich, Germany
Adrian Krüger
Faculty of Philosophy, Philosophy of Science and the Study of Religion, LMU Munich, Geschwister-Scholl-Platz 1, 80539 Munich, Germany
Julia Tovar Verba
Faculty of Philosophy, Philosophy of Science and the Study of Religion, LMU Munich, Geschwister-Scholl-Platz 1, 80539 Munich, Germany
Bahador Bahrami
Centre for Adaptive Rationality, Max Planck Institute for Human Development, Lentzeallee 94, 14195 Berlin, Germany; Department of General and Educational Psychology, LMU Munich, Leopoldstraße 13, 80802 Munich, Germany; Department of Psychology, Royal Holloway, University of London, Egham TW20 0EX, UK; Munich Center for Neurosciences – Brain & Mind, Großhaderner Straße 2, 82152 Munich, Germany
Ophelia Deroy
Faculty of Philosophy, Philosophy of Science and the Study of Religion, LMU Munich, Geschwister-Scholl-Platz 1, 80539 Munich, Germany; Munich Center for Neurosciences – Brain & Mind, Großhaderner Straße 2, 82152 Munich, Germany; Institute of Philosophy, School of Advanced Study, University of London, Senate House, Malet Street, London WC1E 7HU, UK; Corresponding author
Summary: We cooperate with other people despite the risk of being exploited or hurt. If future artificial intelligence (AI) systems are benevolent and cooperative toward us, what will we do in return? Here we show that our cooperative dispositions are weaker when we interact with AI. In nine experiments, humans interacted with either another human or an AI agent in four classic social dilemma economic games and a newly designed game of Reciprocity that we introduce here. Contrary to the hypothesis that people mistrust algorithms, participants trusted their AI partners to be as cooperative as humans. However, they did not reciprocate the AI's benevolence as much, and they exploited the AI more than they exploited other humans. These findings warn that future self-driving cars or co-working robots, whose success depends on humans reciprocating their cooperativeness, run the risk of being exploited. This vulnerability calls not just for smarter machines but also for better human-centered policies.