Discover Artificial Intelligence (May 2025)

T2F: a domain-agnostic multi-agent framework for unstructured text to factuality evaluation items generation

  • Xin Tong,
  • Jingya Wang,
  • Yasen Aizezi,
  • Hanming Zhai,
  • Bo Jin

DOI
https://doi.org/10.1007/s44163-025-00294-w
Journal volume & issue
Vol. 5, no. 1
pp. 1 – 16

Abstract

Large language models (LLMs) demonstrate exceptional linguistic capabilities in text generation but remain prone to factual errors, particularly in specialized domains. Traditional factuality evaluation methods rely primarily on human annotation, which is costly, time-consuming, and difficult to generalize across domains. To address these limitations, this study proposes a multi-agent framework, T2F (Text-to-Factuality), designed to automatically convert unstructured text into high-quality factuality evaluation datasets. T2F operates through the coordinated efforts of four specialized agents: Analysis, Information Extraction, Generation, and Validation. By systematically processing input data, T2F autonomously generates multiple types of assessment items, including single-choice questions, fill-in-the-blank questions, and true/false statements, without requiring human annotation, while maintaining strong cross-domain applicability. Experimental results demonstrate that T2F achieves data conversion success rates of 99% in the World Heritage domain, 98% in the Medical domain, and 85% in the Film domain. The generated data effectively assess the factual accuracy of LLMs, highlighting T2F's capability as a scalable and domain-agnostic factuality evaluation framework.
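
The four-stage flow named in the abstract (Analysis, Information Extraction, Generation, Validation) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not describe how each agent works internally, so every function, class, and rule below is a hypothetical stand-in, with simple string rules used in place of LLM-backed agents. It only shows how unstructured text might pass through four stages and come out as the three item types mentioned.

```python
from dataclasses import dataclass


@dataclass
class Fact:
    subject: str
    value: str
    sentence: str


@dataclass
class Item:
    kind: str            # "true_false", "fill_blank", or "single_choice"
    prompt: str
    answer: str
    options: tuple = ()  # only used by single-choice items


def analyze(text):
    """Analysis stand-in (hypothetical): split text into candidate sentences."""
    return [s.strip() for s in text.split(".") if s.strip()]


def extract(sentences):
    """Information Extraction stand-in: keep sentences shaped like 'X is Y'."""
    facts = []
    for s in sentences:
        subject, sep, value = s.partition(" is ")
        if sep and subject and value:
            facts.append(Fact(subject, value, s))
    return facts


def generate(facts):
    """Generation stand-in: build the three item types from each fact."""
    items = []
    all_values = [f.value for f in facts]
    for f in facts:
        items.append(Item("true_false", f.sentence + ".", "True"))
        items.append(Item("fill_blank", f"{f.subject} is ____.", f.value))
        # Distractors are borrowed from other facts; skip if none exist.
        distractors = [v for v in all_values if v != f.value]
        if distractors:
            options = tuple(sorted([f.value] + distractors[:3]))
            items.append(
                Item("single_choice", f"{f.subject} is which of these?",
                     f.value, options)
            )
    return items


def validate(items):
    """Validation stand-in: drop items with no answer or an unreachable answer."""
    kept = []
    for item in items:
        if not item.answer:
            continue
        if item.kind == "single_choice" and item.answer not in item.options:
            continue
        kept.append(item)
    return kept


def t2f_sketch(text):
    """Chain the four hypothetical stages in the order the abstract lists them."""
    return validate(generate(extract(analyze(text))))
```

Calling `t2f_sketch` on a short passage of "X is Y" sentences returns a list of `Item` objects covering all three item types. In the framework described by the abstract, each stage would be a specialized agent rather than a string rule.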

Keywords