Computers (Mar 2025)

Multifaceted Assessment of Responsible Use and Bias in Language Models for Education

  • Ishrat Ahmed,
  • Wenxing Liu,
  • Rod D. Roscoe,
  • Elizabeth Reilley,
  • Danielle S. McNamara

DOI
https://doi.org/10.3390/computers14030100
Journal volume & issue
Vol. 14, no. 3
p. 100

Abstract

Read online

Large language models (LLMs) are increasingly being utilized to develop tools and services in various domains, including education. However, due to the nature of the training data, these models are susceptible to inherent social or cognitive biases, which can influence their outputs. Furthermore, their handling of critical topics, such as privacy and sensitive questions, is essential for responsible deployment. This study proposes a framework for the automatic detection of biases and violations of responsible use using a synthetic question-based dataset mimicking student–chatbot interactions. We employ the LLM-as-a-judge method to evaluate multiple LLMs for biased responses. Our findings show that some models exhibit more bias than others, highlighting the need for careful consideration when selecting models for deployment in educational and other high-stakes applications. These results emphasize the importance of addressing bias in LLMs and implementing robust mechanisms to uphold responsible AI use in real-world services.

Keywords