JMIR Medical Informatics (Mar 2025)
GPT-3.5 Turbo and GPT-4 Turbo in Title and Abstract Screening for Systematic Reviews
Abstract
This study demonstrated that GPT-4 Turbo had superior specificity compared with GPT-3.5 Turbo (0.98 vs 0.51) and comparable sensitivity (0.85 vs 0.83), whereas GPT-3.5 Turbo processed 100 studies faster (0.9 min vs 1.6 min) in citation screening for systematic reviews. These findings suggest that GPT-4 Turbo may be the more suitable model because of its higher specificity, and they highlight the potential of large language models in optimizing literature selection.
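
For readers less familiar with the two accuracy measures reported here, the following minimal Python sketch shows how sensitivity and specificity are conventionally computed from include/exclude screening decisions. The function name `sensitivity_specificity` and the toy labels are hypothetical and chosen for illustration only; they are not the study's code or data.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    truth: Sequence[bool], predicted: Sequence[bool]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary screening decisions.

    True means "include the citation"; False means "exclude it".
    `truth` holds the reference (human) decisions, `predicted` the model's.
    """
    if len(truth) != len(predicted):
        raise ValueError("truth and predicted must have the same length")

    # Count the four cells of the confusion matrix.
    tp = sum(1 for t, p in zip(truth, predicted) if t and p)
    fn = sum(1 for t, p in zip(truth, predicted) if t and not p)
    tn = sum(1 for t, p in zip(truth, predicted) if not t and not p)
    fp = sum(1 for t, p in zip(truth, predicted) if not t and p)

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one relevant and one irrelevant citation")

    sensitivity = tp / (tp + fn)  # share of relevant citations correctly included
    specificity = tn / (tn + fp)  # share of irrelevant citations correctly excluded
    return sensitivity, specificity


# Hypothetical toy labels (not from the study): 2 relevant, 4 irrelevant citations.
toy_truth = [True, True, False, False, False, False]
toy_predicted = [True, False, False, False, True, True]
toy_sensitivity, toy_specificity = sensitivity_specificity(toy_truth, toy_predicted)
```

In screening terms, low specificity means many irrelevant citations are passed on for full-text review, which is why a gap such as 0.98 vs 0.51 matters even when sensitivity is similar.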