IEEE Access (Jan 2024)

Social Networks and Large Language Models for Division I Basketball Game Winner Prediction

  • Gina Sprint

DOI
https://doi.org/10.1109/ACCESS.2024.3403490
Journal volume & issue
Vol. 12
pp. 84774 – 84784

Abstract

Sporting event outcome prediction is a well-established and actively researched domain, with a particular focus on college basketball’s March Madness tournament. Researchers, fans, and gamblers alike seek accurate game-level predictions using features such as tournament seeds, season performance, and expert opinions. While machine learning algorithms have been harnessed to build prediction models, no perfect model or human-created bracket has emerged. This paper explores a novel approach to basketball game outcome prediction that uses social networks and large language models (LLMs). LLMs are trained to understand and generate text, often eliminating the need for a feature engineering step. Consequently, our method uses tweets posted by official Division I college basketball team Twitter accounts in the days leading up to a game as context for knowledge discovery and winner prediction with LLMs. To do this, we have compiled a comprehensive dataset of over one million tweets from both men’s and women’s teams spanning two consecutive seasons. Instead of relying on traditional numeric features, we employ only tweet text with few-shot/zero-shot learning, thereby offering an emerging social network-based approach for sporting event outcome prediction. Furthermore, using chain-of-thought prompting, we investigate which information in team tweets is predictive of future game performance.
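
As a rough illustration of the prompting setup the abstract describes, the sketch below assembles a zero-shot, few-shot, or chain-of-thought prompt from pre-game tweet text. It is not the paper's implementation: the `Game` structure, the `format_game` and `build_prompt` helpers, the prompt wording, and the team names are all hypothetical, and no LLM is actually called.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Game:
    """Hypothetical container: a matchup plus each team's pre-game tweets."""
    team_a: str
    team_b: str
    tweets_a: List[str]
    tweets_b: List[str]
    winner: Optional[str] = None  # known only for few-shot example games


def format_game(game: Game) -> str:
    """Render both teams' tweets as plain-text context."""
    lines = [f"Team A: {game.team_a}"]
    lines += [f"- {tweet}" for tweet in game.tweets_a]
    lines.append(f"Team B: {game.team_b}")
    lines += [f"- {tweet}" for tweet in game.tweets_b]
    return "\n".join(lines)


def build_prompt(target: Game,
                 examples: Optional[List[Game]] = None,
                 chain_of_thought: bool = False) -> str:
    """Build a prompt; with no examples it is zero-shot, otherwise few-shot."""
    # Assumed instruction wording, not taken from the paper.
    parts = ["Predict which team wins each college basketball game "
             "using only the teams' recent tweets."]
    # Few-shot: prepend past games whose outcome is already known.
    for example in examples or []:
        parts.append(format_game(example) + f"\nWinner: {example.winner}")
    question = format_game(target)
    if chain_of_thought:
        # Ask the model to expose which tweets drive its prediction.
        question += ("\nExplain step by step which tweets inform your "
                     "prediction, then finish with 'Winner: <team>'.")
    else:
        question += "\nWinner:"
    parts.append(question)
    return "\n\n".join(parts)


# Made-up teams and tweets, for illustration only.
past = Game("Wildcats", "Eagles",
            ["Tough loss on the road."], ["Five straight wins!"],
            winner="Eagles")
upcoming = Game("Bears", "Hawks",
                ["Starting guard is back from injury."],
                ["Final practice before the big game."])

zero_shot = build_prompt(upcoming)
few_shot = build_prompt(upcoming, examples=[past])
cot = build_prompt(upcoming, chain_of_thought=True)
```

The resulting string would then be sent to whichever LLM is being evaluated. Keeping prompt construction separate from the model call makes it easy to compare zero-shot, few-shot, and chain-of-thought variants on the same games.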

Keywords