Using ChatGPT to annotate a dataset: A case study in intelligent tutoring systems

Aleksandar Vujinović; Nikola Luburić; Jelena Slivka; Aleksandar Kovačević

Machine Learning with Applications (Jun 2024)

Affiliations

Aleksandar Vujinović: Faculty of Technical Sciences, University of Novi Sad, Serbia
Nikola Luburić: Faculty of Technical Sciences, University of Novi Sad, Serbia
Jelena Slivka: Faculty of Technical Sciences, University of Novi Sad, Serbia
Aleksandar Kovačević (corresponding author): Faculty of Technical Sciences, University of Novi Sad, Trg Dositeja Obradovića 6, 21 000 Novi Sad, Serbia

Journal volume & issue: Vol. 16, p. 100557

Abstract

Large language models such as ChatGPT can learn from examples in context (in-context learning, ICL). Studies have shown that, owing to ICL, ChatGPT achieves impressive performance on various natural language processing tasks. However, to the best of our knowledge, this is the first study to assess ChatGPT's effectiveness in annotating a dataset for training instructor models in intelligent tutoring systems (ITSs). The task of an ITS instructor model is to automatically provide effective tutoring instruction given a student's state, mimicking human instructors. These models are typically implemented as hardcoded rules, which requires expertise and limits their ability to generalize and personalize instructions. These problems could be mitigated by using machine learning (ML). However, developing ML models requires a large dataset of student states annotated with corresponding tutoring instructions. Using human experts to annotate such a dataset is expensive and time-consuming, and it requires pedagogical expertise. Thus, this study explores ChatGPT's potential to act as a pedagogy expert annotator. Using prompt engineering, we created a list of instructions a tutor could recommend to a student. We manually filtered this list and instructed ChatGPT to select the appropriate instruction from the list for a given student's state. We then manually analyzed the ChatGPT responses that could be considered incorrectly annotated. Our results indicate that using ChatGPT as an annotator is an effective alternative to human experts. The contributions of our work are (1) a novel dataset annotation methodology for ITSs, (2) a publicly available dataset of student states annotated with tutoring instructions, and (3) a list of possible tutoring instructions.
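The selection step described in the abstract (present a fixed instruction list, ask for one choice per student state, and set aside questionable responses for manual analysis) might be sketched roughly as follows. This is only an illustration under stated assumptions: the instruction texts, the prompt wording, and the helper names `build_prompt` and `parse_choice` are hypothetical and are not taken from the paper, and the call to ChatGPT itself is deliberately left out.

```python
from typing import Optional

# Hypothetical instruction list; the paper's filtered list is not reproduced here.
INSTRUCTIONS = [
    "Review the prerequisite concept",
    "Attempt an easier exercise",
    "Proceed to the next topic",
]


def build_prompt(student_state: str, instructions: list[str]) -> str:
    """Build a selection prompt for one student state (wording is an assumption)."""
    numbered = "\n".join(
        f"{i}. {text}" for i, text in enumerate(instructions, start=1)
    )
    return (
        "You are an experienced tutor.\n"
        f"Student state: {student_state}\n"
        "Choose exactly one instruction from the list and reply with its number only.\n"
        f"{numbered}"
    )


def parse_choice(reply: str, instructions: list[str]) -> Optional[str]:
    """Map a model reply to an instruction, or None if it needs manual review."""
    token = reply.strip().rstrip(".")
    if token.isdecimal() and 1 <= int(token) <= len(instructions):
        return instructions[int(token) - 1]
    # Anything that is not a valid list index is flagged for a human to inspect.
    return None
```

In such a setup, a reply for which `parse_choice` returns `None` would go to the manual analysis pile instead of being written to the dataset as an annotation.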

Published in Machine Learning with Applications

ISSN: 2666-8270 (Online)
Publisher: Elsevier
Country of publisher: United Kingdom
LCC subjects: Science: Science (General): Cybernetics; Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: https://www.journals.elsevier.com/machine-learning-with-applications
