Automatic essay scoring for discussion forum in online learning based on semantic and keyword similarities

Bachriah Fatwa Dhini; Abba Suganda Girsang; Unggul Utan Sufandi; Heny Kurniawati

doi:10.1108/AAOUJ-02-2023-0027

AAOU Journal (Dec 2023)

Affiliations

Bachriah Fatwa Dhini: Department of Multimedia Teaching Material Production Center, Universitas Terbuka, Tangerang Selatan, Indonesia
Abba Suganda Girsang: Computer Science Department, BINUS Graduate Program - Master of Computer Science, Bina Nusantara University, Jakarta, Indonesia
Unggul Utan Sufandi: Faculty of Science and Technology, Universitas Terbuka, Tangerang Selatan, Indonesia
Heny Kurniawati: Faculty of Science and Technology, Universitas Terbuka, Tangerang Selatan, Indonesia

Journal volume & issue: Vol. 18, no. 3
pp. 262 – 278

Abstract

Purpose – The authors constructed an automatic essay scoring (AES) model for a discussion forum and compared its results with scores given by human evaluators. The proposed scoring relies on two parameters, semantic similarity and keyword similarity, and uses pre-trained SentenceTransformers models selected for producing the highest-scoring vector embeddings. The two parameters are combined to improve the model's accuracy.

Design/methodology/approach – Model development is divided into seven stages: (1) data collection, (2) data pre-processing, (3) selection of pre-trained SentenceTransformers models, (4) semantic similarity (sentence pairs), (5) keyword similarity, (6) final score calculation and (7) model evaluation.

Findings – The multilingual models paraphrase-multilingual-MiniLM-L12-v2 and distilbert-base-multilingual-cased-v1 achieved the highest scores in a comparison of 11 pre-trained multilingual SentenceTransformers models on Indonesian data (Dhini and Girsang, 2023). Both models were adopted in this study. The combination of the two parameters is obtained by comparing the keywords extracted from the responses with the rubric keywords. Based on the experimental results, the proposed combination can improve the evaluation results by 0.2.

Originality/value – This study uses discussion forum data from the general biology course in online learning at the open university for the 2020.2 and 2021.2 semesters. Discussion forum grading is still done manually. In this study, the authors created a model that automatically scores discussion forum posts, which are essays, based on the lecturer's answers and rubrics.
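
The abstract names the two parameters but does not give the formula that combines them, so the following is only a minimal sketch under stated assumptions. It scores semantic similarity as the cosine between two embedding vectors, scores keyword similarity as the share of rubric keywords found among the response keywords, and mixes the two with an equal weight on a 0-100 scale. The function names, the 0.5 weight, the 0-100 scale and the toy vectors are all hypothetical. In practice the vectors would come from one of the SentenceTransformers models named above rather than being typed in by hand.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # no direction to compare against
    return dot / (norm_u * norm_v)


def keyword_similarity(response_keywords, rubric_keywords):
    """Fraction of rubric keywords that also occur among the response keywords."""
    rubric = {k.lower() for k in rubric_keywords}
    if not rubric:
        return 0.0
    response = {k.lower() for k in response_keywords}
    return len(rubric & response) / len(rubric)


def final_score(semantic, keyword, weight=0.5, max_score=100):
    """Weighted mix of both similarities.

    ASSUMPTION: the equal weighting and the 0-100 scale are illustrative;
    they are not taken from the paper.
    """
    semantic = max(0.0, semantic)  # treat negative cosine as no similarity
    return (weight * semantic + (1 - weight) * keyword) * max_score


# Toy stand-ins for the embeddings of a student response and the lecturer's answer.
response_vec = [0.2, 0.7, 0.1]
answer_vec = [0.25, 0.65, 0.05]

sem = cosine_similarity(response_vec, answer_vec)
key = keyword_similarity(["sel", "mitosis"], ["sel", "mitosis", "DNA", "kromosom"])
print(round(final_score(sem, key), 2))
```

A score produced this way could then be set against the human evaluators' scores, which is the comparison the abstract describes for the evaluation stage.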

Published in AAOU Journal

ISSN: 1858-3431 (Print); 2414-6994 (Online)
Publisher: Emerald Publishing
Country of publisher: United Kingdom
LCC subjects: Education: Theory and practice of education
Website: http://www.emeraldgrouppublishing.com/services/publishing/aaouj/index.htm
