Mathematics (May 2024)

Novel Feature-Based Difficulty Prediction Method for Mathematics Items Using XGBoost-Based SHAP Model

  • Xifan Yi,
  • Jianing Sun,
  • Xiaopeng Wu

DOI
https://doi.org/10.3390/math12101455
Journal volume & issue
Vol. 12, no. 10
p. 1455

Abstract

Read online

The level of difficulty of mathematical test items is a critical aspect for evaluating test quality and educational outcomes. Accurately predicting item difficulty during test creation is thus significantly important for producing effective test papers. This study used more than ten years of content and score data from China’s Henan Provincial College Entrance Examination in Mathematics as an evaluation criterion for test difficulty, and all data were obtained from the Henan Provincial Department of Education. Based on the framework established by the National Center for Education Statistics (NCES) for test item assessment methodology, this paper proposes a new framework containing eight features considering the uniqueness of mathematics. Next, this paper proposes an XGBoost-based SHAP model for analyzing the difficulty of mathematics tests. By coupling the XGBoost method with the SHAP method, the model not only evaluates the difficulty of mathematics tests but also analyzes the contribution of specific features to item difficulty, thereby increasing transparency and mitigating the “black box” nature of machine learning models. The model has a high prediction accuracy of 0.99 for the training set and 0.806 for the test set. With the model, we found that parameter-level features and reasoning-level features are significant factors influencing the difficulty of subjective items in the exam. In addition, we divided senior secondary mathematics knowledge into nine units based on Chinese curriculum standards and found significant differences in the distribution of the eight features across these different knowledge units, which can help teachers place different emphasis on different units during the teaching process. In summary, our proposed approach significantly improves the accuracy of item difficulty prediction, which is crucial for intelligent educational applications such as knowledge tracking, automatic test item generation, and intelligent paper generation. These results provide tools that are better aligned with and responsive to students’ learning needs, thus effectively informing educational practice.

Keywords