Algorithms (Oct 2021)

Using Decision Trees and Random Forest Algorithms to Predict and Determine Factors Contributing to First-Year University Students’ Learning Performance

  • Thao-Trang Huynh-Cam
  • Long-Sheng Chen
  • Huynh Le

DOI
https://doi.org/10.3390/a14110318
Journal volume & issue
Vol. 14, no. 11
p. 318

Abstract

First-year students’ learning performance has received much attention in educational practice and theory. Previous works built prediction models from variables that can only be obtained during the course or as the semester progresses, through questionnaire surveys and interviews. Such models cannot provide timely support for students whose poor performance is caused by economic factors. Therefore, other variables are needed that allow prediction results to be reached earlier. This study uses family background variables, which can be obtained before the semester starts, to build learning performance prediction models for freshmen using random forest (RF), C5.0, CART, and multilayer perceptron (MLP) algorithms. A real sample of 2407 freshmen enrolled in 12 departments of a vocational university in Taiwan was employed. The experimental results showed that CART outperformed the C5.0, RF, and MLP algorithms. The most important features were mother’s occupation, department, father’s occupation, main source of living expenses, and admission status. The extracted knowledge rules are expected to serve as indicators for early prediction of students’ performance, so that strategic interventions can be planned before students begin the semester.

Keywords