Computers and Education: Artificial Intelligence (Dec 2024)
Predicting at-risk students in the early stage of a blended learning course via machine learning using limited data
Abstract
Academic failure is a persistent challenge in education. In this study, we focus on identifying at-risk students in a blended learning (BL) course despite the limited data available. Several motivational variables, grouped into categories that include time management and event-occurrence frequency, are analyzed to determine their effect on student performance. We use a machine-learning classifier to compare two approaches: 1) a benchmark study, which uses data from the same academic year for both training and testing; and 2) a prospective study, which applies a model trained on the previous fall semester to predict outcomes for the upcoming fall semester. A window-expansion strategy is adopted, in which the model is evaluated periodically as the observation window expands, to enhance performance and facilitate timely intervention. The results of the prospective approach are consistent with those of the benchmark, demonstrating its generalizability across academic years and student populations. This approach is therefore promising for the early identification of at-risk students in the BL course. The Shapley additive explanations (SHAP) method highlights the importance of time-management variables, particularly study in time, for identifying at-risk students. This study enhances our understanding of early detection methods for at-risk students. By analyzing time-related behaviors, we establish a robust foundation for data-driven decisions to improve future BL implementations and to support motivation and self-regulated learning practices.
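The prospective setup with an expanding window can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not name the classifier or the exact features, so a simple nearest-centroid rule stands in for the machine-learning classifier, the two features (mean weekly event count and fraction of active weeks) are hypothetical, and the data are made up. Label 1 marks an at-risk student.

```python
from statistics import mean

def expanding_window_features(weekly_counts, week):
    """Aggregate one student's activity from week 1 through `week` (inclusive).

    Hypothetical features: mean events per week and fraction of active weeks.
    """
    window = weekly_counts[:week]
    active_weeks = sum(1 for count in window if count > 0)
    return [sum(window) / week, active_weeks / week]

def fit_centroids(X, y):
    """Stand-in classifier (not the paper's): mean feature vector per class."""
    centroids = {}
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        centroids[label] = [mean(col) for col in zip(*rows)]
    return centroids

def predict(centroids, x):
    """Return the label whose centroid is closest in squared Euclidean distance."""
    def dist2(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda label: dist2(centroids[label]))

def prospective_eval(train_logs, train_labels, test_logs, test_labels, weeks):
    """Train on the earlier cohort and test on the later one at each window size."""
    results = {}
    for w in weeks:
        X_train = [expanding_window_features(s, w) for s in train_logs]
        X_test = [expanding_window_features(s, w) for s in test_logs]
        model = fit_centroids(X_train, train_labels)
        preds = [predict(model, x) for x in X_test]
        correct = sum(p == t for p, t in zip(preds, test_labels))
        results[w] = correct / len(test_labels)
    return results

# Made-up weekly event counts per student; 1 = at risk, 0 = not at risk.
prev_fall_logs = [[5, 6, 4, 7], [0, 1, 0, 0], [4, 5, 6, 5], [1, 0, 0, 1]]
prev_fall_labels = [0, 1, 0, 1]
next_fall_logs = [[6, 5, 5, 6], [0, 0, 1, 0]]
next_fall_labels = [0, 1]

accuracy_by_week = prospective_eval(
    prev_fall_logs, prev_fall_labels,
    next_fall_logs, next_fall_labels,
    weeks=[2, 4],
)
print(accuracy_by_week)
```

In this sketch, the benchmark setting would correspond to drawing both the training and the test students from the same cohort, whereas the prospective setting keeps the two cohorts strictly separate by academic year.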