Model for the Prediction of Dropout in Higher Education in Peru applying Machine Learning Algorithms: Random Forest, Decision Tree, Neural Network and Support Vector Machine

Omar A Jimenez; Ashley Jesús Llontop; Lenis Wong

doi:10.23919/FRUCT58615.2023.10143068

Proceedings of the XXth Conference of Open Innovations Association FRUCT (May 2023)

Affiliations

Omar A Jimenez: Universidad Peruana de Ciencias Aplicadas
Ashley Jesús Llontop: Universidad Peruana de Ciencias Aplicadas
Lenis Wong: Universidad Peruana de Ciencias Aplicadas

Journal volume & issue: Vol. 33, no. 1
pp. 116 – 124

Abstract

University dropout affects not only students but also their families, universities, and society at large. The problem is global and is observed in many parts of the world, yet few solutions make efficient use of available technology and information. This study therefore implements a predictive analysis model to identify students at risk of dropping out of Peruvian universities and the variables that influence that risk. The model is developed following the Cross Industry Standard Process for Data Mining (CRISP-DM) methodology and uses four Machine Learning algorithms: Random Forest, Decision Tree, Neural Network, and Support Vector Machine. As applied here, the methodology consists of five phases: business understanding, data understanding, data preparation, modeling, and evaluation. The experiment surveyed 385 students from public and private universities in Peru and considered cognitive, affective, family environment, pre-university, career, and university variables. The results showed that the most influential variables in predicting university dropout were "age", "term", and the student's "financing method". The Random Forest algorithm performed best, with an AUC of 0.9623 in predicting university dropout.
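The abstract reports model quality as AUC (area under the ROC curve). As a minimal sketch of what that metric measures, the snippet below computes AUC as the fraction of (dropout, non-dropout) student pairs in which the dropout student receives the higher predicted risk score. The function name `roc_auc` and the labels and scores are hypothetical illustrations, not the paper's code or data.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical data (assumption): 1 = student dropped out, 0 = student stayed.
labels = [1, 1, 1, 0, 0, 0, 0, 0]
# Hypothetical predicted dropout risk from some classifier.
scores = [0.91, 0.75, 0.40, 0.55, 0.30, 0.22, 0.10, 0.05]

auc = roc_auc(labels, scores)
print(f"AUC = {auc:.4f}")
```

An AUC of 1.0 means every dropout student is ranked above every non-dropout student, while 0.5 corresponds to random ranking. The pairwise loop is quadratic in the number of students, which is acceptable for a sample of a few hundred respondents.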

Keywords: dropout; machine learning; random forest; decision tree; neural network; support vector machine

Published in Proceedings of the XXth Conference of Open Innovations Association FRUCT

ISSN: 2305-7254 (Print); 2343-0737 (Online)
Publisher: FRUCT
Country of publisher: Finland
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering: Telecommunication
Website: http://fruct.org/publication
