Social Sciences and Humanities Open (Jan 2022)
Predicting the academic performance of middle- and high-school students using machine learning algorithms
Abstract
This research is one of the first to predict the academic performance of middle- and high-school students using Machine Learning Algorithms (MLAs) based on numerous socio-demographic (such as age, gender, obesity, average household income, family size, and marital status of parents), school-related (type of gender education and academic level), and student-related (stress and lifestyle) variables. The Grade Point Average (GPA), which reflects academic performance, is taken as the model output. Five MLAs are used to identify and rank the parameters affecting academic performance: multinomial logistic regression, artificial neural network, random forest, gradient boosting, and stacking. Three metrics are used to evaluate the MLAs: precision, recall, and F1-score. The gradient boosting method outperforms the other techniques, followed by random forest. The model analysis indicates that a health-conscious lifestyle correlates positively with academic performance, whereas stress has a negative impact. Gender, however, is not found to be a significant predictor of a student's academic performance.
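To make the three evaluation metrics concrete, the following is a minimal sketch of how precision, recall, and F1-score can be computed per class and then averaged for a multi-class GPA prediction task. It is not the authors' code: the GPA bands ("low", "mid", "high"), the sample labels, the function names, and the choice of macro-averaging are all assumptions made for illustration, since the abstract does not state how GPA was binned or how the scores were averaged.

```python
from collections import Counter


def per_class_scores(y_true, y_pred):
    """Return {label: (precision, recall, f1)} for every label that appears."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class was wrongly assigned
            fn[truth] += 1   # true class was missed
    scores = {}
    for label in labels:
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[label] = (precision, recall, f1)
    return scores


def macro_average(scores):
    """Unweighted mean of (precision, recall, f1) across classes (an assumption)."""
    n = len(scores)
    return tuple(sum(s[i] for s in scores.values()) / n for i in range(3))


# Hypothetical GPA bands and predictions, for illustration only.
y_true = ["low", "low", "mid", "mid", "high", "high"]
y_pred = ["low", "mid", "mid", "mid", "high", "low"]

scores = per_class_scores(y_true, y_pred)
macro = macro_average(scores)

for label, (p, r, f) in scores.items():
    print(f"{label:>4}: precision={p:.2f} recall={r:.2f} f1={f:.2f}")
print("macro: precision={:.2f} recall={:.2f} f1={:.2f}".format(*macro))
```

Macro-averaging weights every GPA band equally, which matters when some bands contain few students; a support-weighted average would be an equally plausible choice.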