Information (Apr 2025)

Exploring Large Language Models’ Ability to Describe Entity-Relationship Schema-Based Conceptual Data Models

  • Andrea Avignone,
  • Alessia Tierno,
  • Alessandro Fiori,
  • Silvia Chiusano

DOI
https://doi.org/10.3390/info16050368
Journal volume & issue
Vol. 16, no. 5
p. 368

Abstract

In the field of databases, Large Language Models (LLMs) have recently been studied for generating SQL queries from textual descriptions, while their use for conceptual or logical data modeling remains less explored. The conceptual design of relational databases commonly relies on the entity-relationship (ER) data model, where translation rules map an ER schema into the corresponding relational tables and their constraints. Our study investigates the capability of LLMs to describe, in natural language, a database conceptual data model based on the ER schema. Whether for documentation, onboarding, or communication with non-technical stakeholders, LLMs can significantly improve the process of explaining an ER schema by generating accurate descriptions of how its components interact and of the information it represents. To guide the LLM through challenging constructs, we define specific hints that enrich the ER schema. We evaluate several LLMs (ChatGPT 3.5 and 4, Llama2, Gemini, Mistral 7B) and use several metrics (F1 score, ROUGE, perplexity) to assess the quality of the generated descriptions and to compare the models.
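As a rough illustration of how an F1 score could be applied to a generated description, the following minimal sketch compares the ER elements a description is expected to mention with those it actually mentions. The abstract does not specify the evaluation procedure, so the toy schema, the substring-matching rule, and the helper names `mentioned_elements` and `f1_score` are all assumptions made for illustration only.

```python
def mentioned_elements(description: str, vocabulary: set[str]) -> set[str]:
    """Return the vocabulary names that occur in the description.

    Assumption: a naive case-insensitive substring match stands in for
    whatever matching procedure a real evaluation would use.
    """
    text = description.lower()
    return {name for name in vocabulary if name.lower() in text}


def f1_score(expected: set[str], mentioned: set[str]) -> float:
    """Harmonic mean of precision and recall over sets of ER elements."""
    true_positives = len(expected & mentioned)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(mentioned)
    recall = true_positives / len(expected)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy schema: two entities and one relationship.
expected = {"Student", "Course", "Enrolls"}
# "Professor" is deliberately NOT part of the schema, so mentioning it
# counts as a false positive.
vocabulary = expected | {"Professor"}

description = "Each student enrolls in one or more courses taught by a professor."
found = mentioned_elements(description, vocabulary)
score = f1_score(expected, found)
print(f"F1 = {score:.3f}")
```

Here the description covers every expected element (recall 1.0) but also mentions an element outside the schema (precision 0.75), which lowers the F1 score accordingly.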

Keywords