Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi) (Aug 2024)

Comparing Correlation-Based Feature Selection and Symmetrical Uncertainty for Student Dropout Prediction

  • Haryono Setiadi,
  • Indah Paksi Larasati,
  • Esti Suryani,
  • Dewi Wisnu Wardani,
  • Hasan Dwi Cahyono Wardani,
  • Ardhi Wijayanto

DOI
https://doi.org/10.29207/resti.v8i4.5911
Journal volume & issue
Vol. 8, no. 4
pp. 542 – 554

Abstract

Predicting student dropout is essential for universities dealing with high attrition rates. This study compares two feature selection (FS) methods, correlation-based feature selection (CFS) and symmetrical uncertainty (SU), in educational data mining for dropout prediction. We evaluate these methods using three classification algorithms: decision tree (DT), support vector machine (SVM), and naive Bayes (NB). Results show that SU outperforms CFS overall, with SVM achieving the highest accuracy (98.16%) when combined with SU. Moreover, the study identifies total credits in the fourth semester, cumulative GPA, gender, and student domicile as key predictors of student dropout. These findings show that feature selection can improve the accuracy of dropout prediction, helping educational institutions retain more students.
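
For readers unfamiliar with symmetrical uncertainty, the following is a minimal sketch of the standard definition, SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)), for discrete variables. It is not the authors' implementation. The function names and the toy data are hypothetical and chosen only for illustration; they are not taken from the study's dataset.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def symmetrical_uncertainty(x, y):
    """Return SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)), a score in [0, 1]."""
    h_x = entropy(x)
    h_y = entropy(y)
    h_xy = entropy(list(zip(x, y)))  # joint entropy H(X, Y)
    denom = h_x + h_y
    if denom == 0:
        # Both variables are constant, so neither carries information.
        return 0.0
    mutual_info = h_x + h_y - h_xy  # I(X; Y)
    return 2.0 * mutual_info / denom


# Hypothetical toy data (not from the study): a binary class label,
# one feature that copies the label, and one that is independent of it.
label = [1, 1, 0, 0, 1, 1, 0, 0]
copy_of_label = [1, 1, 0, 0, 1, 1, 0, 0]
independent = [1, 0, 1, 0, 1, 0, 1, 0]

su_copy = symmetrical_uncertainty(copy_of_label, label)
su_indep = symmetrical_uncertainty(independent, label)
```

In a filter-style selection step, features would typically be ranked by their SU score against the class label and the highest-scoring ones kept. The ranking rule and any cutoff used in the study are not stated in the abstract.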

Keywords