Applied Sciences (Aug 2022)

Predicting GPA of University Students with Supervised Regression Machine Learning Models

  • Lukáš Falát,
  • Terézia Piscová

DOI
https://doi.org/10.3390/app12178403
Journal volume & issue
Vol. 12, no. 17
p. 8403

Abstract

This paper addresses the prediction of grade point average (GPA) with supervised machine learning models. Based on a literature review, we divide the factors that may influence GPA into three groups: psychological, sociological and study factors. Data collected with a questionnaire are evaluated using statistical analysis. In a confirmatory data analysis, we compare the answers of men and women, of university students who came from grammar schools and those who came from secondary vocational schools, and of students grouped by average grade. The Shapiro–Wilk test is used to check normality and the Mann–Whitney U test to test differences between groups. We identify the factors influencing GPA through correlation analysis, using the Pearson test and ANOVA, and retain those that show a statistically significant relationship with GPA. Subsequently, we implement supervised machine learning models. We create ten prediction models using linear regression, decision trees and random forest; each predicts GPA from the independent variables. Judged by the mean absolute percentage error (MAPE) on the five validation sets in cross-validation, the random forest model achieves the best generalization accuracy, with an average MAPE of 11.13%. We therefore recommend random forest as a starting model for modeling student results.
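The evaluation scheme described in the abstract, averaging MAPE over five validation folds, can be illustrated with a minimal sketch. This is not the authors' code: the data are synthetic, the predictor `hours` and all function names are hypothetical, and a single-predictor least-squares line stands in for the paper's linear regression, decision tree and random forest models so that the example needs only the Python standard library.

```python
import random
from statistics import mean


def mape(actual, predicted):
    """Mean absolute percentage error in percent; actual values must be nonzero."""
    return 100.0 * mean(abs((a - p) / a) for a, p in zip(actual, predicted))


def fit_line(xs, ys):
    """Ordinary least squares with one predictor; returns (intercept, slope)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def cross_validated_mape(xs, ys, k=5, seed=0):
    """Return (average MAPE, per-fold MAPEs) over k validation folds."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k disjoint validation sets
    scores = []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        # Fit on the training part only, then score on the held-out fold.
        b0, b1 = fit_line([xs[i] for i in train], [ys[i] for i in train])
        preds = [b0 + b1 * xs[i] for i in fold]
        scores.append(mape([ys[i] for i in fold], preds))
    return mean(scores), scores


# Synthetic, made-up data (assumption): weekly study hours and a GPA-like target.
rng = random.Random(42)
hours = [rng.uniform(0, 20) for _ in range(100)]
gpa = [3.0 - 0.05 * h + rng.gauss(0, 0.2) for h in hours]

avg_mape, fold_mapes = cross_validated_mape(hours, gpa, k=5)
print(f"average validation MAPE: {avg_mape:.2f}%")
```

Comparing several candidate models then amounts to running the same folds for each one and ranking them by average validation MAPE, which is the criterion the abstract reports for the random forest.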

Keywords