GPT-3.5 Turbo and GPT-4 Turbo in Title and Abstract Screening for Systematic Reviews

Takehiko Oami; Yohei Okada; Taka-aki Nakada

doi:10.2196/64682

JMIR Medical Informatics (Mar 2025)

DOI: https://doi.org/10.2196/64682
Journal volume & issue: Vol. 13, e64682

Abstract

This study compared GPT-3.5 Turbo and GPT-4 Turbo in citation screening for systematic reviews. GPT-4 Turbo had superior specificity (0.98 vs 0.51) and comparable sensitivity (0.85 vs 0.83), whereas GPT-3.5 Turbo processed 100 studies faster (0.9 min vs 1.6 min). These findings suggest that GPT-4 Turbo may be more suitable because of its higher specificity, and they highlight the potential of large language models in optimizing literature selection.
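
For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how sensitivity and specificity can be computed from include/exclude screening decisions. It is not the authors' code: the function name `screening_metrics` and the toy labels are hypothetical, and it assumes each citation has a model decision and a reference (human reviewer) decision encoded as booleans, with `True` meaning "include".

```python
import math


def screening_metrics(predicted, reference):
    """Return (sensitivity, specificity) for paired include/exclude decisions.

    predicted: model decisions, True = include (hypothetical encoding)
    reference: reference-standard decisions, True = include
    """
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")

    tp = fp = tn = fn = 0
    for p, r in zip(predicted, reference):
        if p and r:
            tp += 1  # relevant citation correctly included
        elif p and not r:
            fp += 1  # irrelevant citation wrongly included
        elif not p and r:
            fn += 1  # relevant citation wrongly excluded
        else:
            tn += 1  # irrelevant citation correctly excluded

    # Sensitivity: share of truly relevant citations that were included.
    sensitivity = tp / (tp + fn) if (tp + fn) else math.nan
    # Specificity: share of truly irrelevant citations that were excluded.
    specificity = tn / (tn + fp) if (tn + fp) else math.nan
    return sensitivity, specificity


# Toy data for illustration only; these are not results from the study.
reference = [True, True, False, False, False]
predicted = [True, False, False, False, True]
sensitivity, specificity = screening_metrics(predicted, reference)
```

In a screening context, low specificity means many irrelevant citations are passed on to full-text review, which is why the abstract weighs specificity heavily when the two models' sensitivities are similar.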

Published in JMIR Medical Informatics

ISSN: 2291-9694 (Online)
Publisher: JMIR Publications
Country of publisher: Canada
LCC subjects: Medicine: Medicine (General): Computer applications to medicine. Medical informatics
Website: https://medinform.jmir.org
