PLOS Digital Health (Jul 2025)

Using large language models to extract information from pediatric clinical reports.

  • Katharina Danhauser,
  • Yingding Wang,
  • Christoph Klein,
  • Uta Tacke,
  • Larissa Mantoan,
  • Laura Aurica Ritter,
  • Florian Heinen,
  • Chiara Nobile,
  • Moritz Tacke

DOI
https://doi.org/10.1371/journal.pdig.0000919
Journal volume & issue
Vol. 4, no. 7
p. e0000919

Abstract


Most medical documentation, including clinical reports, exists in unstructured formats that hinder efficient data analysis and integration into decision-support systems for patient care and research. Both areas could profit significantly from reliable automatic analysis of these documents, but current methods for extracting data from them are labor-intensive and inflexible. Large language models (LLMs) offer a promising, flexible alternative for transforming unstructured medical documents into structured data. This study assesses the performance of nine LLMs in extracting structured data from pediatric clinical reports. The results demonstrate that both commercial and open-source LLMs can identify patient-specific information with high accuracy, with top-performing models exceeding 90% accuracy on key tasks.
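To illustrate the general approach the abstract describes — prompting an LLM to turn free-text reports into structured records — the sketch below builds a JSON-schema prompt and robustly parses the model's reply. This is a minimal illustration, not the authors' pipeline: the field names, the example report, and the `mock_llm` stand-in (which replaces a real API call) are all hypothetical.

```python
import json

# Hypothetical target fields -- illustrative only, not taken from the study.
FIELDS = ["patient_age", "diagnosis", "medications"]

def build_prompt(report_text: str) -> str:
    """Ask the model to answer with a single JSON object holding the target fields."""
    return (
        "Extract the following fields from the clinical report and answer "
        f"with a single JSON object using exactly these keys: {FIELDS}. "
        "Use null for fields that are not mentioned.\n\n"
        f"Report:\n{report_text}"
    )

def parse_response(raw: str) -> dict:
    """Parse the model's reply, tolerating surrounding prose or code fences."""
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end == -1:
        raise ValueError("no JSON object found in model output")
    record = json.loads(raw[start : end + 1])
    # Keep only the requested keys so downstream code sees a fixed schema.
    return {key: record.get(key) for key in FIELDS}

def mock_llm(prompt: str) -> str:
    """Stand-in for a real LLM API call; returns a canned, slightly noisy reply."""
    return (
        "Here is the extracted information:\n"
        '{"patient_age": "7 years", "diagnosis": "epilepsy", '
        '"medications": ["levetiracetam"]}'
    )

if __name__ == "__main__":
    reply = mock_llm(build_prompt("7-year-old with epilepsy, on levetiracetam."))
    print(parse_response(reply))
```

In a real deployment the `mock_llm` function would be replaced by a call to a commercial or locally hosted open-source model, and accuracy would be evaluated against manually extracted gold-standard annotations, as the study does across nine models.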