Virtual Reality & Intelligent Hardware (Oct 2022)
NPIPVis: A Visualization System Involving NBA Visual Analysis and Integrated Learning Model Prediction
Abstract
Background: Data-driven event analysis has gradually become the backbone of modern competitive sports analysis, and sports data analysis tasks increasingly rely on computer vision and machine learning models. Existing sports visualization systems focus on player–team data visualization; they are not intuitive enough for visualizing team season win–loss data and game time-series data, and they neglect the prediction of all-star players.
Methods: This study designed NPIPVis, an interactive visualization system that uses parallel aggregated ordered hypergraph (PAOH) dynamic hypergraphs, Calliope visual data story technology, and iStoryline narrative visualization technology to visualize the regular statistics and game-time data of players and teams. NPIPVis includes dynamic hypergraphs of team wins and losses and game plot narrative visualization components. In addition, an ensemble-learning-based all-star player prediction model, SRR-voting, was proposed. Starting from the existing minority and majority samples, it uses the synthetic minority oversampling technique (SMOTE) and RandomUnderSampler to generate and eliminate, respectively, samples of a certain size, balancing the number of all-star and average players in the dataset. A random forest algorithm is then used to extract and construct player features and is combined with a voting ensemble model to predict all-star players. GridSearchCV is used to optimize the hyperparameters of each model in the ensemble, together with fivefold cross-validation to improve the generalization ability of the model. Finally, SHAP is introduced to enhance the interpretability of the model.
Results: Experiments comparing the SRR-voting model with six common models show that accuracy, F1-score, and recall are significantly improved, which verifies the effectiveness and practicality of the SRR-voting model.
Conclusions: This paper combines data visualization and machine learning to design a National Basketball Association (NBA) data visualization system that helps a general audience visualize game data and predict all-star players. The approach can be extended to other sports events or related fields.
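The class-balancing step described in the Methods can be illustrated with a minimal sketch. It is not the authors' implementation and does not use the imbalanced-learn library: it is a simplified, hand-rolled SMOTE-style interpolation plus random undersampling, written with the Python standard library only. The function names, the two toy features (points and assists per game), the neighbour count `k`, and the target class size are all assumptions made for illustration.

```python
import math
import random


def smote_like_oversample(minority, n_new, k=3, rng=None):
    """Create n_new synthetic points, each interpolated between a minority
    sample and one of its k nearest minority neighbours (SMOTE-style).
    Assumes at least two minority samples."""
    if rng is None:
        rng = random.Random(0)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        others = [p for p in minority if p is not base]
        # k nearest minority neighbours of the chosen base sample
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position on the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


def random_undersample(majority, n_keep, rng=None):
    """Keep a random subset of n_keep majority samples, without replacement."""
    if rng is None:
        rng = random.Random(0)
    return rng.sample(majority, n_keep)


def balance(minority, majority, target, k=3, seed=0):
    """Oversample the minority class and undersample the majority class
    so that both end up with `target` samples (hypothetical helper)."""
    rng = random.Random(seed)
    n_new = max(0, target - len(minority))
    new_minority = list(minority) + smote_like_oversample(minority, n_new, k, rng)
    kept_majority = random_undersample(majority, min(target, len(majority)), rng)
    return new_minority, kept_majority


# Toy, made-up data: (points per game, assists per game).
all_star_players = [(25.0, 7.0), (27.5, 6.0), (30.1, 8.2), (24.3, 9.5)]
data_rng = random.Random(42)
average_players = [
    (data_rng.uniform(2, 15), data_rng.uniform(0, 4)) for _ in range(40)
]

all_star_bal, average_bal = balance(all_star_players, average_players, target=12)
print(len(all_star_bal), len(average_bal))
```

In this sketch the original all-star samples are kept and only the synthetic ones are interpolated, so every generated point lies on a segment between two real minority samples. A full pipeline along the lines of the abstract would then pass the balanced data to feature selection and a voting ensemble.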