Assessing ChatGPT-4’s ability to generate forensic reports: a study of artificial intelligence in forensics

Halit Canberk Aydogan; Büşra Yıkar; Hüseyin Balandız; Sait Özsoy

doi:10.1186/s41935-025-00445-1

Egyptian Journal of Forensic Sciences (May 2025)

Affiliations

Halit Canberk Aydogan: Department of Forensic Medicine, Faculty of Medicine, Ordu University Training and Research Hospital
Büşra Yıkar: Department of Forensic Medicine, Health Sciences University Gülhane Faculty of Medicine
Hüseyin Balandız: Department of Forensic Medicine, Health Sciences University Gülhane Faculty of Medicine
Sait Özsoy: Department of Forensic Medicine, Health Sciences University Gülhane Faculty of Medicine

DOI: https://doi.org/10.1186/s41935-025-00445-1
Journal volume & issue: Vol. 15, no. 1, pp. 1–12

Abstract

Background: The rise of artificial intelligence in the field of medicine has had a wide-ranging impact, from clinical applications to research. Artificial intelligence-based language models such as Chat Generative Pre-Trained Transformer have the potential to expedite the process of forensic reporting, particularly in forensic medicine. This study aims to evaluate the forensic reporting capabilities of Chat Generative Pre-Trained Transformer-4 in comparison with forensic medicine assistants. The Turkish Penal Code-related forensic medicine guide and 20 case examples were used to train Chat Generative Pre-Trained Transformer-4 in this study. Chat Generative Pre-Trained Transformer-4 was asked to write reports like forensic medicine specialists. In the retrospective phase, 100 forensic cases were assessed by Chat Generative Pre-Trained Transformer-4, while in the prospective phase, 266 new cases were assessed by both Chat Generative Pre-Trained Transformer-4 and forensic medicine assistants. Two forensic medicine specialists assessed the accuracy of these reports in terms of adherence to the forensic medicine guide.

Results: Chat Generative Pre-Trained Transformer-4 achieved an accuracy rate of 96.6% in the retrospective phase and 96.2% in the prospective phase for the combined categories of “Life-threatening” and “Simple Medical Intervention”. Forensic medicine assistants, however, demonstrated a higher accuracy rate of 99.1% in these categories compared to Chat Generative Pre-Trained Transformer-4.

Conclusions: The success of Chat Generative Pre-Trained Transformer-4 indicates that the combination of technology and human expertise could establish new standards in forensic reporting. However, it is emphasized that supervision by forensic medicine specialists remains crucial in this process.

Published in Egyptian Journal of Forensic Sciences

ISSN: 2090-536X (Print); 2090-5939 (Online)
Publisher: SpringerOpen
Country of publisher: United Kingdom
LCC subjects: Law: Law in general. Comparative and uniform law. Jurisprudence; Medicine: Medicine (General)
Website: https://ejfs.springeropen.com/
