Scientific Reports (Sep 2023)

Building a prediction model of college students’ sports behavior based on machine learning method: combining the characteristics of sports learning interest and sports autonomy

  • Haibo Liu,
  • Wenzhi Hou,
  • Iringan Emolyn,
  • Yu Liu

DOI
https://doi.org/10.1038/s41598-023-41496-5
Journal volume & issue
Vol. 13, no. 1
pp. 1 – 16

Abstract

College students’ sports behavior is affected by many factors. Sports learning interest and sports autonomy support are potential psychological factors with an important influence on that behavior. Machine learning methods are widely used to construct prediction models and show high efficiency. To understand how sports learning interest and sports autonomy support affect college students’ sports behavior (physical exercise level), this study uses machine learning methods to build a prediction model and uncover the relationship between them. The paper summarizes the factors that affect college students’ sports behavior (physical exercise level) from two aspects, sports autonomy and sports learning interest, and supplements them with demographic and sociological information collected from the students. Prediction models are built with several machine learning algorithms and compared against one another to determine the optimal model. The results for each model are:

  • Logistic regression: accuracy 0.7288, recall 0.7590, F1 0.7397
  • K-nearest neighbors (KNN): accuracy 0.6895, recall 0.7596, F1 0.7096
  • Naive Bayes: accuracy 0.7166, recall 0.6703, F1 0.6864
  • Linear discriminant analysis (LDA): accuracy 0.7263, recall 0.7290, F1 0.7265
  • Support vector machine: accuracy 0.6563, recall 0.7700, F1 0.6845
  • Gradient-boosted decision tree (GBDT): accuracy 0.6953, recall 0.7039, F1 0.6989
  • Decision tree: accuracy 0.6872, recall 0.6507, F1 0.6672
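Accuracy, recall, and F1 are standard classification metrics. As a reminder of how they relate, the following minimal sketch computes them from confusion-matrix counts. It assumes a binary outcome for illustration only; the abstract does not state how the physical exercise level was encoded or how the metrics were averaged. The function name `classification_metrics` and the example counts are hypothetical and are not taken from the study.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return accuracy, precision, recall and F1 for binary confusion counts.

    tp/fp/fn/tn are true positives, false positives, false negatives and
    true negatives. Ratios with a zero denominator are reported as 0.0.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical counts chosen for illustration; not data from the paper.
example = classification_metrics(tp=40, fp=10, fn=10, tn=40)
print(example)
```

Because F1 combines precision and recall, a model can have high recall and still a modest F1 when its precision is low.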
The logistic regression model performs best when sports learning interest and sports autonomy support are combined as predictors, which the authors attribute to its linear classification characteristics, high computational efficiency, and good adaptability to feature selection and outlier handling. The study concludes that logistic regression achieves the highest prediction level when these two characteristics are used together to predict college students’ sports behavior (physical exercise level), and that this result provides an important reference for improving that behavior.
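The comparison behind this conclusion can be checked directly against the reported figures. The short sketch below stores the accuracy, recall, and F1 values exactly as given in the abstract and finds the top model for each metric. The names `reported`, `METRICS`, and `best_by_metric` are illustrative and do not come from the paper.

```python
# Figures as reported in the abstract: (accuracy, recall, F1).
reported = {
    "Logistic regression": (0.7288, 0.7590, 0.7397),
    "KNN": (0.6895, 0.7596, 0.7096),
    "Naive Bayes": (0.7166, 0.6703, 0.6864),
    "LDA": (0.7263, 0.7290, 0.7265),
    "Support vector machine": (0.6563, 0.7700, 0.6845),
    "GBDT": (0.6953, 0.7039, 0.6989),
    "Decision tree": (0.6872, 0.6507, 0.6672),
}

METRICS = ("accuracy", "recall", "f1")


def best_by_metric(results):
    """Map each metric name to the model with the highest reported value."""
    best = {}
    for index, metric in enumerate(METRICS):
        best[metric] = max(results, key=lambda model: results[model][index])
    return best


best = best_by_metric(reported)
for metric, model in best.items():
    print(f"{metric}: {model}")
```

On these figures, logistic regression leads on accuracy and F1, while the support vector machine has the highest recall (0.7700).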