Can large language models help predict results from a complex behavioural science study?

Steffen Lippert; Anna Dreber; Magnus Johannesson; Warren Tierney; Wilson Cyrus-Lai; Eric Luis Uhlmann; Thomas Pfeiffer

doi:10.1098/rsos.240682

Royal Society Open Science (Sep 2024)

Affiliations

Steffen Lippert: Department of Economics, University of Auckland, Auckland, New Zealand
Anna Dreber: Department of Economics, Stockholm School of Economics, Stockholm, Sweden
Magnus Johannesson: Department of Economics, Stockholm School of Economics, Stockholm, Sweden
Warren Tierney: Organisational Behaviour Area/Marketing Area, INSEAD, Singapore
Wilson Cyrus-Lai: Graduate School of Business, Stanford University, CA, USA
Eric Luis Uhlmann: Organisational Behaviour Area/Marketing Area, INSEAD, Singapore
Thomas Pfeiffer: New Zealand Institute for Advanced Study, Massey University, Auckland, New Zealand

DOI: https://doi.org/10.1098/rsos.240682
Journal volume & issue: Vol. 11, no. 9

Abstract

We tested whether large language models (LLMs) can help predict results from a complex behavioural science experiment. In study 1, we investigated the performance of the widely used LLMs GPT-3.5 and GPT-4 in forecasting the empirical findings of a large-scale experimental study of emotions, gender, and social perceptions. We found that GPT-4, but not GPT-3.5, matched the performance of a cohort of 119 human experts, with correlations of 0.89 (GPT-4), 0.07 (GPT-3.5) and 0.87 (human experts) between aggregated forecasts and realized effect sizes. In study 2, providing participants from a university subject pool the opportunity to query a GPT-4 powered chatbot significantly increased the accuracy of their forecasts. Results indicate promise for artificial intelligence (AI) to help anticipate—at scale and minimal cost—which claims about human behaviour will find empirical support and which ones will not. Our discussion focuses on avenues for human–AI collaboration in science.
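The accuracy measure reported in the abstract, a correlation between aggregated forecasts and realized effect sizes, can be illustrated with a minimal sketch. It assumes a simple mean across forecasters and a Pearson correlation; the abstract specifies neither, so both are assumptions. The function names `aggregate` and `pearson` are hypothetical, and the numbers are invented for illustration and are not data from the study.

```python
from math import sqrt
from statistics import mean


def aggregate(forecasts_by_forecaster):
    """Average the forecasts across forecasters, one value per effect.

    Assumption: aggregation is a simple mean; the study may use another rule.
    """
    n_effects = len(forecasts_by_forecaster[0])
    return [
        mean(forecaster[i] for forecaster in forecasts_by_forecaster)
        for i in range(n_effects)
    ]


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Invented example: three forecasters, four effects (NOT data from the study).
forecasts = [
    [0.10, 0.30, 0.50, 0.20],
    [0.20, 0.20, 0.60, 0.10],
    [0.00, 0.40, 0.40, 0.30],
]
realized = [0.05, 0.25, 0.55, 0.15]  # invented realized effect sizes

aggregated = aggregate(forecasts)
r = pearson(aggregated, realized)
print(f"correlation between aggregated forecasts and realized effects: {r:.2f}")
```

Aggregating before correlating means the resulting figure describes the forecasting group as a whole rather than any individual forecaster, which is how the abstract frames the comparison between GPT-4, GPT-3.5 and the human experts.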

Published in Royal Society Open Science

ISSN: 2054-5703 (Online)
Publisher: The Royal Society
Country of publisher: United Kingdom
LCC subjects: Science
Website: https://royalsocietypublishing.org/journal/rsos
