Interpretable Models for Early Prediction of Certification in MOOCs: A Case Study on a MOOC for Smart City Professionals

Georgios Kostopoulos; Theodor Panagiotakopoulos; Sotiris Kotsiantis; Christos Pierrakeas; Achilles Kameas

doi:10.1109/access.2021.3134787

IEEE Access (Jan 2021)

Affiliations

Georgios Kostopoulos: School of Science and Technology, Hellenic Open University, Patras, Greece
Theodor Panagiotakopoulos: School of Science and Technology, Hellenic Open University, Patras, Greece
Sotiris Kotsiantis: Department of Mathematics, University of Patras, Patras, Greece
Christos Pierrakeas: Department of Management Science and Technology, University of Patras, Patras, Greece
Achilles Kameas: School of Science and Technology, Hellenic Open University, Patras, Greece

DOI: https://doi.org/10.1109/access.2021.3134787
Journal volume & issue: Vol. 9
pp. 165881 – 165891

Abstract

Over the last few years, Massive Open Online Courses (MOOCs) have expanded rapidly and are becoming the most typical form of online and distance higher education. As a result, a tremendous amount of data is generated and stored on MOOC online learning platforms. This data should be effectively transformed into knowledge, thus providing valuable feedback to learners and enhancing decision-making practices in the educational field. Despite the benefits and learning prospects that MOOCs offer to learners, there is a considerable gap between enrollment and completion rates. In this context, the main aim of this study is to exploit predictive analytics and explainable artificial intelligence for the early prediction of student certification in an 11-week MOOC for smart cities, namely DevOps. A wide range of Machine Learning models were built employing familiar classification algorithms. The experimental results revealed that the models based on the Gradient Boosting, Logistic Regression and Light Gradient Boosted Machine classifiers prevailed in terms of Accuracy, Area Under Curve, Recall, Precision, F1-score, Kappa, and Matthews Correlation Coefficient, achieving a predictive accuracy of 94.41% at the end of the second week of the course. Therefore, students who are less likely to obtain a certificate could be identified at an early enough stage to provide them with sufficient support actions and targeted intervention strategies. Finally, the performance attributes (i.e., overall grades per week) proved to be the most important predictors for identifying students at risk of failure.
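For readers unfamiliar with the evaluation measures named in the abstract, the following is a minimal sketch of how most of them can be computed from the four cells of a binary confusion matrix (certified vs. not certified). The function name `binary_metrics` and the example counts are hypothetical and are not taken from the study; the abstract does not state which tooling the authors used. Area Under Curve is omitted because it requires predicted scores rather than counts.

```python
from math import sqrt


def binary_metrics(tp, fp, fn, tn):
    """Return common binary-classification metrics from confusion-matrix counts.

    tp, fp, fn, tn are the numbers of true positives, false positives,
    false negatives and true negatives (hypothetical inputs).
    """
    n = tp + fp + fn + tn
    accuracy = (tp + tn) / n
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)

    # Cohen's kappa: observed agreement corrected for the agreement
    # expected by chance from the row and column totals.
    p_chance = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    kappa = (accuracy - p_chance) / (1 - p_chance) if p_chance != 1 else 0.0

    # Matthews Correlation Coefficient: correlation between predicted
    # and actual labels, defined as 0 when any marginal total is 0.
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "kappa": kappa,
        "mcc": mcc,
    }


if __name__ == "__main__":
    # Illustrative counts only; these are not results from the paper.
    for name, value in binary_metrics(tp=40, fp=5, fn=5, tn=50).items():
        print(f"{name}: {value:.4f}")
```

Kappa and the Matthews Correlation Coefficient are useful alongside Accuracy when the two classes are imbalanced, as is common when far fewer learners obtain a certificate than enroll.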

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
