Scientific Data (May 2024)

A large-scale multivariate soccer athlete health, performance, and position monitoring dataset

  • Cise Midoglu,
  • Andreas Kjæreng Winther,
  • Matthias Boeker,
  • Susann Dahl Pettersen,
  • Sigurd Pedersen,
  • Nourhan Ragab,
  • Tomas Kupka,
  • Steven A. Hicks,
  • Morten Bredsgaard Randers,
  • Ramesh Jain,
  • Håvard J. Dagenborg,
  • Svein Arne Pettersen,
  • Dag Johansen,
  • Michael A. Riegler,
  • Pål Halvorsen

DOI
https://doi.org/10.1038/s41597-024-03386-x
Journal volume & issue
Vol. 11, no. 1
pp. 1 – 19

Abstract

Data analysis for athletic performance optimization and injury prevention is of tremendous interest to sports teams and the scientific community. However, sports data are often sparse and hard to obtain due to legal restrictions, unwillingness to share, and lack of personnel resources to be assigned to the tedious process of data curation. These constraints make it difficult to develop automated systems for analysis, which require large datasets for learning. We therefore present SoccerMon, the largest soccer athlete dataset available today containing both subjective and objective metrics, collected from two different elite women’s soccer teams over two years. Our dataset contains 33,849 subjective reports and 10,075 objective reports, the latter including over six billion GPS position measurements. SoccerMon can not only play a valuable role in developing better analysis and prediction systems for soccer, but also inspire similar data collection activities in other domains which can benefit from subjective athlete reports, GPS position information, and/or time-series data in general.