Information Research: An International Electronic Journal (Mar 2025)
Are we there yet? Evaluation of AI-generated metadata for online information resources
Abstract
Introduction. Generative AI tools are increasingly used to create descriptive metadata, the quality of which is key for information discovery and for supporting information users’ tasks. Machine-readable online information resources such as websites naturally lend themselves to automatic metadata creation, yet assessments of AI-generated metadata for them are lacking, and AI metadata quality research to date is limited to two metadata standards. Method. This experimental study assessed the quality of AI-generated descriptive metadata in the four most widely used standards: Dublin Core, MODS, MARC, and BIBFRAME. Three generative AI tools – Gemini, Gemini Advanced, and ChatGPT4 – were used to create metadata for an educational website. Analysis. The zero-shot queries prompting the AI tools to generate metadata followed the same structure and included a link to each metadata scheme’s openly accessible documentation. A comparative in-depth analysis of the accuracy and completeness of the entire resulting AI-generated metadata records was performed. Results. Overall, AI-generated metadata does not meet the quality threshold. ChatGPT performs somewhat better than the other two tools on completeness, but accuracy is similarly low in all three tools. Conclusions. The current effectiveness of AI tools in generating metadata does not support the conclusion that the involvement of human metadata experts in creating quality (and therefore functional) metadata can be significantly reduced without a strong negative impact on information discovery.
Keywords