Education Sciences (Nov 2019)

Predictive Models for Imbalanced Data: A School Dropout Perspective

  • Thiago M. Barros,
  • Plácido A. Souza Neto,
  • Ivanovitch Silva,
  • Luiz Affonso Guedes

DOI
https://doi.org/10.3390/educsci9040275
Journal volume & issue
Vol. 9, no. 4
p. 275

Abstract

Predicting school dropout is important for the smooth running of an educational system. The problem is typically framed as a two-class classification task over statistical datasets of students' educational activities: one class identifies students who tend to persist, and the other identifies students who tend to drop out. This task often encounters a phenomenon, class imbalance, that masks the obtained results. This study examines that phenomenon and provides a reliable educational data mining technique that accurately predicts dropout. In particular, three classification techniques are used: decision tree, neural networks, and Balanced Bagging. The performance of these classifiers is tested with and without data balancing by downsampling, SMOTE, and ADASYN. Among the evaluation metrics considered, the geometric mean and UAR (unweighted average recall) are found to provide reliable results when predicting dropout with the Balanced Bagging classifier.
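To illustrate why the geometric mean and UAR are more informative than plain accuracy on imbalanced data, the following minimal sketch computes all three for a hypothetical cohort. The labels, the 90/10 class split, and the function names are assumptions made for illustration only; they are not taken from the study's data or code.

```python
from math import prod


def per_class_recall(y_true, y_pred, label):
    """Recall for one class: correctly predicted members / all true members."""
    total = sum(1 for t in y_true if t == label)
    if total == 0:
        raise ValueError(f"no true samples with label {label!r}")
    hits = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
    return hits / total


def accuracy(y_true, y_pred):
    """Fraction of all samples predicted correctly."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


def uar(y_true, y_pred, labels=(0, 1)):
    """Unweighted average recall: arithmetic mean of the per-class recalls."""
    recalls = [per_class_recall(y_true, y_pred, c) for c in labels]
    return sum(recalls) / len(recalls)


def geometric_mean(y_true, y_pred, labels=(0, 1)):
    """Geometric mean of the per-class recalls."""
    recalls = [per_class_recall(y_true, y_pred, c) for c in labels]
    return prod(recalls) ** (1 / len(recalls))


# Hypothetical cohort: 90 students persist (0), 10 drop out (1).
y_true = [0] * 90 + [1] * 10

# A trivial classifier that predicts "persist" for every student.
y_pred = [0] * 100

acc = accuracy(y_true, y_pred)
uar_score = uar(y_true, y_pred)
gmean_score = geometric_mean(y_true, y_pred)

print(f"accuracy={acc:.2f}  UAR={uar_score:.2f}  G-mean={gmean_score:.2f}")
```

In this toy case the trivial classifier scores high on accuracy while missing every dropout, whereas UAR falls to chance level and the geometric mean falls to zero. Both metrics weight the minority class equally with the majority class, which is why they expose a failure that accuracy hides.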

Keywords