Heliyon (Jun 2024)

Predictive modelling of student dropout risk: Practical insights from a South Korean distance university

  • Eui-Yeong Seo,
  • Jaemo Yang,
  • Ji-Eun Lee,
  • Geunju So

Journal volume & issue
Vol. 10, no. 11
p. e30960

Abstract

Distance education supports lifelong learning and empowers individuals in rapidly changing societal conditions, yet it suffers from high dropout rates caused by a range of individual and societal obstacles. This study addresses the challenge of building a practical prediction model by analyzing extensive real-world time-point data from a well-established online university in Seoul. Covering 144,540 instances from 2018 to 2022, the study integrates diverse datasets to compare the accuracy of models trained on longitudinal, semester-wise, and gender-specific datasets. Applying stepwise backward elimination to demographic, academic, and online metrics, the study identified significant dropout indicators, including age (particularly when binned), residential area, specific occupations, GPA, and LMS log metrics. The results show that, despite societal changes, data from the most recent four semesters can be used effectively for stable prediction training. Gender-based analysis showed that different factors influence dropout risk for males and females. The Light Gradient Boosting Machine (LGBM) algorithm achieved the highest prediction accuracy, as confirmed by the ROC-AUC metric. However, logistic regression performed competitively and offered in-depth interpretation. In South Korea's distinct educational setting, combining advanced algorithms such as LGBM with the interpretive strength of logistic regression is key to effective student support strategies.

Keywords