Judgment and Decision Making (Jul 2018)
Testing the ability of the surprisingly popular method to predict NFL games
Abstract
We consider the recently developed ``surprisingly popular'' method for aggregating decisions across a group of people (Prelec, Seung and McCoy, 2017). The method has shown impressive performance in a range of decision-making situations, but typically for situations in which the correct answer is already established. We consider the ability of the surprisingly popular method to make predictions in a situation where the correct answer does not exist at the time people are asked to make decisions. Specifically, we tested its ability to predict the winners of the 256 US National Football League (NFL) games in the 2017--2018 season. Each of these predictions used participants who self-rated as ``extremely knowledgeable'' about the NFL, drawn from a set of 100 participants recruited through Amazon Mechanical Turk (AMT). We compare the accuracy and calibration of the surprisingly popular method to a variety of alternatives: the mode and confidence-weighted predictions of the expert AMT participants, the individual and aggregated predictions of media experts, and a statistical Elo method based on the performance histories of the NFL teams. Our results are exploratory and need replication, but we find that the surprisingly popular method outperforms all of these alternatives and has reasonable calibration properties relating the confidence of its predictions to the accuracy of those predictions.
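For readers unfamiliar with the method, the core decision rule for a two-option question can be sketched as follows. This is a minimal illustration of the general idea in Prelec, Seung and McCoy (2017), not the analysis code used in this study: each respondent gives their own answer and an estimate of the share of respondents who will choose a given option, and the method selects the option whose actual share exceeds its average predicted share. The function name, the labels ``A'' and ``B'', and the example numbers are all hypothetical.

```python
from statistics import mean


def surprisingly_popular(votes, predicted_share_a):
    """Pick the surprisingly popular option for a binary question.

    votes: each respondent's own answer, "A" or "B" (hypothetical labels).
    predicted_share_a: each respondent's estimate (0 to 1) of the fraction
        of respondents who will answer "A".
    Returns "A", "B", or None when actual and predicted shares are equal.
    """
    actual_a = sum(v == "A" for v in votes) / len(votes)
    expected_a = mean(predicted_share_a)
    if actual_a > expected_a:
        return "A"  # "A" is more popular than respondents predicted
    if actual_a < expected_a:
        return "B"  # "A" underperformed its prediction, so "B" is chosen
    return None


# Made-up example: "A" wins the majority vote (60%), but respondents
# expected it to receive about 80%, so "B" is the surprisingly popular answer.
votes = ["A", "A", "A", "B", "B"]
predicted_share_a = [0.9, 0.8, 0.9, 0.7, 0.7]
print(surprisingly_popular(votes, predicted_share_a))
```

In this made-up example the surprisingly popular answer differs from the modal answer, which is the situation in which the two aggregation rules can be distinguished.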