Judgment and Decision Making (Mar 2017)

An IRT forecasting model: linking proper scoring rules to item response theory

  • Yuanchao Emily Bo,
  • David V. Budescu,
  • Charles Lewis,
  • Philip E. Tetlock,
  • Barbara Mellers

Journal volume & issue
Vol. 12, no. 2
pp. 90 – 103

Abstract

This article proposes an Item Response Theoretical (IRT) forecasting model that incorporates proper scoring rules and provides evaluations of forecasters’ expertise in relation to the features of the specific questions they answer. We illustrate the model using geopolitical forecasts obtained by the Good Judgment Project (GJP) (see Mellers, Ungar, Baron, Ramos, Gurcay, Fincher, Scott, Moore, Atanasov, Swift, Murray, Stone and Tetlock, 2014). The expertise estimates from the IRT model, which take into account variation in the difficulty and discrimination power of the events, capture the underlying construct being measured and are highly correlated with the forecasters’ Brier scores. Furthermore, our expertise estimates based on the first three years of the GJP data are better predictors of both the forecasters’ fourth-year Brier scores and their activity level than either the overall Brier scores or Merkle’s (2016) predictions based on the same period. Lastly, we discuss the benefits of using event-characteristic information in forecasting.

Keywords