Computers and Education: Artificial Intelligence (Jun 2024)

DNA of learning behaviors: A novel approach of learning performance prediction by NLP

  • Chien-Chang Lin,
  • Eddie S.J. Cheng,
  • Anna Y.Q. Huang,
  • Stephen J.H. Yang

Journal volume & issue
Vol. 6
p. 100227

Abstract

In recent years, learning analytics has gained significant attention as educators and researchers seek to understand and optimize the learning process in online learning systems. This paper presents a novel methodology for predicting learning performance in such systems using natural language processing (NLP) and embedding techniques. The study focuses on two online learning systems, BookRoll and Viscode, and analyzes students' learning behaviors using system logs extracted from their databases. The logs are converted into semester and daily learning description documents that capture the learners' daily activities and progress. To transform this natural language data into numerical representations, the transformer-based BERT model, Google Gemini, and OpenAI's large text embedding model are used to generate embeddings for the learning descriptions. K-means clustering is then applied to the embeddings to identify distinct learning behaviors exhibited by students. The clusters are labeled with numbers, and the labels of each student's daily learning descriptions are combined into a sequence, referred to as the DNA of learning behaviors. This DNA representation captures the learning status of students, and a machine learning model is trained on it to predict learning performance. The experimental results demonstrate the efficacy of the proposed methodology in achieving highly convincing predictions. The contribution of this research lies in a unique approach that integrates NLP methodologies and embedding techniques to enable accurate learning performance prediction in online learning systems.
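
The clustering and sequence-building steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the two-dimensional vectors are made-up stand-ins for real BERT, Gemini, or OpenAI embeddings, the student names are hypothetical, and the function names `kmeans_labels` and `build_dna` are invented for illustration. A small pure-Python k-means is used so the example runs without external libraries.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_labels(points, k, iters=10):
    """Plain Lloyd's k-means; returns one integer cluster label per point."""
    # Farthest-first initialisation keeps this toy example deterministic
    # (an assumption of this sketch, not something the abstract specifies).
    centroids = [list(points[0])]
    while len(centroids) < k:
        far = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(far))
    labels = []
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def build_dna(daily_embeddings, k):
    """Cluster all daily embeddings together, then join each student's
    daily cluster labels into one string (single digits, so k <= 10)."""
    students = list(daily_embeddings)
    flat = [vec for s in students for vec in daily_embeddings[s]]
    labels = kmeans_labels(flat, k)
    dna, start = {}, 0
    for s in students:
        n = len(daily_embeddings[s])
        dna[s] = "".join(str(lab) for lab in labels[start:start + n])
        start += n
    return dna


# Hypothetical data: three days of 2-D "embeddings" per student.
toy = {
    "student_a": [[0.1, 0.0], [0.0, 0.1], [5.0, 5.1]],
    "student_b": [[5.1, 5.0], [4.9, 5.0], [0.1, 0.1]],
}
dna = build_dna(toy, k=2)
print(dna)  # {'student_a': '001', 'student_b': '110'}
```

In this toy run, days with similar embeddings share a digit, so each string summarizes how a student's behavior changes from day to day. Such sequences could then serve as input features for a downstream prediction model, which is omitted here.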

Keywords