PLoS ONE (Jan 2024)
Using machine learning to determine the nationalities of the fastest 100-mile ultra-marathoners and identify top racing events.
Abstract
The present study intended to determine the nationality of the fastest 100-mile ultra-marathoners and the country/events where the fastest 100-mile races are held. A machine learning model based on the XG Boost algorithm was built to predict the running speed from the athlete's age (Age group), gender (Gender), country of origin (Athlete country) and where the race occurred (Event country). Model explainability tools were then used to investigate how each independent variable influenced the predicted running speed. A total of 172,110 race records from 65,392 unique runners from 68 different countries participating in races held in 44 different countries were used for analyses. The model rates Event country (0.53) as the most important predictor (based on data entropy reduction), followed by Athlete country (0.21), Age group (0.14), and Gender (0.13). In terms of participation, the United States leads by far, followed by Great Britain, Canada, South Africa, and Japan, in both athlete and event counts. The fastest 100-mile races are held in Romania, Israel, Switzerland, Finland, Russia, the Netherlands, France, Denmark, Czechia, and Taiwan. The fastest athletes come mostly from Eastern European countries (Lithuania, Latvia, Ukraine, Finland, Russia, Hungary, Slovakia) and also Israel. In contrast, the slowest athletes come from Asian countries like China, Thailand, Vietnam, Indonesia, Malaysia, and Brunei. The difference among male and female predictions is relatively small at about 0.25 km/h. The fastest age group is 25-29 years, but the average speeds of groups 20-24 and 30-34 years are close. Participation, however, peaks for the age group 40-44 years. The model predicts the event location (country of event) as the most important predictor for a fast 100-mile race time. The fastest race courses were occurred in Romania, Israel, Switzerland, Finland, Russia, the Netherlands, France, Denmark, Czechia, and Taiwan. Athletes and coaches can use these findings for their race preparation to find the most appropriate racecourse for a fast 100-mile race time.