Computers and Education: Artificial Intelligence (Jun 2025)

Assessing how accurately large language models encode and apply the Common European Framework of Reference for Languages

  • Luca Benedetto,
  • Gabrielle Gaudeau,
  • Andrew Caines,
  • Paula Buttery

Journal volume & issue
Vol. 8
p. 100353

Abstract

Large Language Models (LLMs) can have a transformative effect on a variety of domains, including education. It is therefore pressing to understand whether these models have knowledge of (in other words, how they have encoded) the specific pedagogical requirements of different educational domains, and whether they use this knowledge when performing educational tasks. In this work, we propose an approach to evaluate the knowledge, or encoding, that LLMs have of the Common European Framework of Reference for Languages (CEFR), and use it to evaluate five modern LLMs. Our study shows that the suite of tasks we propose is quite challenging for all five LLMs: they often provide results that are unsatisfactory and would be unusable in educational applications. This suggests that, even if the models encode some information about the CEFR, this knowledge is not really leveraged when performing downstream tasks.

Keywords