Applied Sciences (Nov 2021)

An Enhanced Evolutionary Student Performance Prediction Model Using Whale Optimization Algorithm Boosted with Sine-Cosine Mechanism

  • Thaer Thaher,
  • Atef Zaguia,
  • Sana Al Azwari,
  • Majdi Mafarja,
  • Hamouda Chantar,
  • Anmar Abuhamdah,
  • Hamza Turabieh,
  • Seyedali Mirjalili,
  • Alaa Sheta

DOI
https://doi.org/10.3390/app112110237
Journal volume & issue
Vol. 11, no. 21
p. 10237

Abstract

Student performance prediction (SPP) is a challenging problem that managers face at any institution. Collecting quantitative and qualitative educational data from many sources, such as exam centers, virtual courses, and e-learning systems, is not a simple task. Even after collection, the data may be imbalanced, incomplete, or biased, and may mix different data types such as strings, numbers, and letters. One of the most common challenges in this area is the large number of attributes (features). Identifying the most valuable features is needed to improve overall student performance. This paper proposes an evolutionary SPP model that uses an enhanced form of the Whale Optimization Algorithm (EWOA) as a wrapper feature selection method to keep the most informative features and enhance prediction quality. The proposed EWOA combines the Whale Optimization Algorithm (WOA) with the Sine Cosine Algorithm (SCA) and the Logistic Chaotic Map (LCM) to improve the overall performance of WOA. SCA strengthens the exploitation process inside WOA and minimizes the probability of becoming stuck in local optima. The main idea is to enhance the worst half of the WOA population using SCA. In addition, the LCM strategy is employed to control population diversity and improve the exploration process. We also handle the imbalanced data using the Adaptive Synthetic (ADASYN) sampling technique and convert WOA to a binary variant using transfer functions (TFs) from different families (S-shaped and V-shaped). Two real educational datasets are used, and five different classifiers are employed: Decision Trees (DT), k-Nearest Neighbors (k-NN), Naive Bayes (NB), Linear Discriminant Analysis (LDA), and LogitBoost (LB). The results show that LDA is the most reliable classifier on both datasets. In addition, with the selected transfer functions, the proposed EWOA outperforms other wrapper feature selection methods in the literature.
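
The abstract names two building blocks that are easy to illustrate in isolation: transfer functions that turn a continuous search position into a binary feature mask, and a logistic chaotic map. The following is a minimal sketch only, not the authors' implementation. The specific choices here are assumptions not stated in the abstract: the sigmoid as the S-shaped function, |tanh| as the V-shaped function, the bit-flip rule for the V-shaped case, the control parameter r = 4.0, and all function names.

```python
import math
import random


def s_shaped(x):
    """Sigmoid, a common S-shaped transfer function (assumed choice)."""
    # Numerically stable form that avoids overflow for large |x|.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def v_shaped(x):
    """|tanh(x)|, a common V-shaped transfer function (assumed choice)."""
    return abs(math.tanh(x))


def binarize_s(position, rng):
    """S-shaped rule: select feature i with probability S(x_i)."""
    return [1 if rng.random() < s_shaped(x) else 0 for x in position]


def binarize_v(position, current_bits, rng):
    """V-shaped rule (assumed): flip bit i with probability V(x_i)."""
    return [
        1 - bit if rng.random() < v_shaped(x) else bit
        for x, bit in zip(position, current_bits)
    ]


def logistic_map(x0, n, r=4.0):
    """Return n iterates of the logistic map x <- r * x * (1 - x)."""
    values = []
    x = x0
    for _ in range(n):
        x = r * x * (1.0 - x)
        values.append(x)
    return values


if __name__ == "__main__":
    rng = random.Random(0)
    position = [-2.0, 0.0, 3.5, 0.4]  # hypothetical continuous whale position
    mask_s = binarize_s(position, rng)
    mask_v = binarize_v(position, [0, 1, 0, 1], rng)
    print("S-shaped mask:", mask_s)
    print("V-shaped mask:", mask_v)
    print("Chaotic sequence:", logistic_map(0.3, 5))
```

In a wrapper setting, each binary mask would select the feature columns passed to a classifier, and the classifier's performance would feed the fitness value; that evaluation loop is omitted here.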

Keywords