Symmetry (Mar 2023)

A Data-Driven Machine Learning Algorithm for Predicting the Outcomes of NBA Games

  • Tomislav Horvat,
  • Josip Job,
  • Robert Logozar,
  • Časlav Livada

DOI
https://doi.org/10.3390/sym15040798
Journal volume & issue
Vol. 15, no. 4
p. 798

Abstract

We propose a new, data-driven model that uses machine learning methods to predict the outcomes of games in the NBA and possibly other basketball leagues. The paper starts with a strict mathematical formulation of the basketball statistical quantities and the performance indicators derived from them. The backbone of our model is the extended team efficiency index, which consists of two asymmetric parts: (i) the team efficiency index, generally based on some individual efficiency index (in our case, the NBA player efficiency index), and (ii) the comparing part, in which the observed team is rewarded for every selected feature in which it outperforms its rival. Based on the average of the past extended indices, the predicted extended indices are calculated symmetrically for both teams competing in the observed future game. The relative value of those indices defines the win function, which predicts the game outcome. The prediction model includes the concept of the optimal time window (OTW) for the training data. The training datasets were extracted from at most four, and the testing datasets from at most two, of the five consecutive observed NBA seasons (2013/2014–2017/2018). The model uses basic, derived, advanced, and league-wise basketball game elements as its features, whose preparation and extraction are briefly discussed. The proposed model was tested for several choices of the training and testing seasons, with and without OTWs. The average prediction accuracy obtained is around 66%, and the maximum accuracy obtained is around 78%. This is satisfactory and comparable to the better results reported by other authors.
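
The pipeline outlined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the exact formulas, so the function names, the unit reward per outperformed feature, the use of the standard NBA efficiency formula on team totals, the time window expressed as a number of most recent games, and the tie rule are all assumptions made for illustration only.

```python
from statistics import mean

def efficiency(s):
    """Standard NBA efficiency (EFF), applied here to team totals (assumption)."""
    missed_fg = s["fga"] - s["fgm"]
    missed_ft = s["fta"] - s["ftm"]
    return (s["pts"] + s["reb"] + s["ast"] + s["stl"] + s["blk"]
            - missed_fg - missed_ft - s["tov"])

def extended_index(team, rival, features, reward=1.0):
    """Efficiency part plus a comparing part.

    The comparing part adds a hypothetical fixed `reward` for every
    selected feature in which `team` strictly outperforms `rival`.
    Features are assumed to be "higher is better".
    """
    bonus = sum(reward for f in features if team[f] > rival[f])
    return efficiency(team) + bonus

def predict_winner(past_a, past_b, window=None):
    """Compare the averages of past extended indices of teams A and B.

    `window` keeps only the most recent games, a simplified stand-in
    for the optimal time window (OTW). Ties go to team A (assumption).
    """
    if window is not None:
        past_a, past_b = past_a[-window:], past_b[-window:]
    return "A" if mean(past_a) >= mean(past_b) else "B"
```

In this sketch, shortening the window can change the prediction: a team whose recent extended indices are high but whose older ones are low is favoured only when the older games are excluded, which is the intuition behind searching for an optimal window.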

Keywords