Egyptian Journal of Forensic Sciences (May 2025)
Assessing ChatGPT-4’s ability to generate forensic reports: a study of artificial intelligence in forensics
Abstract
Background
The rise of artificial intelligence in medicine has had a wide-ranging impact, from clinical applications to research. Artificial intelligence-based language models such as Chat Generative Pre-Trained Transformer (ChatGPT) have the potential to expedite reporting in forensic medicine. This study aims to evaluate the forensic reporting capabilities of ChatGPT-4 in comparison with forensic medicine assistants.
Methods
The Turkish Penal Code-related forensic medicine guide and 20 case examples were used to train ChatGPT-4, which was then asked to write reports as a forensic medicine specialist would. In the retrospective phase, 100 forensic cases were assessed by ChatGPT-4; in the prospective phase, 266 new cases were assessed by both ChatGPT-4 and forensic medicine assistants. Two forensic medicine specialists assessed the accuracy of these reports in terms of adherence to the forensic medicine guide.
Results
ChatGPT-4 achieved an accuracy rate of 96.6% in the retrospective phase and 96.2% in the prospective phase for the combined categories of “Life-threatening” and “Simple Medical Intervention”. Forensic medicine assistants achieved a higher accuracy rate of 99.1% in these categories.
Conclusions
The success of ChatGPT-4 indicates that the combination of technology and human expertise could establish new standards in forensic reporting. However, supervision by forensic medicine specialists remains crucial in this process.
Keywords