Assessing the potential role of ChatGPT in spine surgery research

Isabel Herzog; Dhruv Mendiratta; Ashok Para; Ari Berg; Neil Kaushal; Michael Vives

doi:10.1002/jeo2.12057

Journal of Experimental Orthopaedics (Jul 2024)

Affiliations

Isabel Herzog: Rutgers New Jersey Medical School, Newark, New Jersey, USA
Dhruv Mendiratta: Rutgers New Jersey Medical School, Newark, New Jersey, USA
Ashok Para: Rutgers New Jersey Medical School, Newark, New Jersey, USA
Ari Berg: Rutgers New Jersey Medical School, Newark, New Jersey, USA
Neil Kaushal: Rutgers New Jersey Medical School, Newark, New Jersey, USA
Michael Vives: Rutgers New Jersey Medical School, Newark, New Jersey, USA

DOI: https://doi.org/10.1002/jeo2.12057
Journal volume & issue: Vol. 11, no. 3

Abstract

Purpose: Since its release in November 2022, Chat Generative Pre-Trained Transformer 3.5 (ChatGPT), a complex machine learning model, has garnered more than 100 million users worldwide. The aim of this study is to determine how well ChatGPT can generate novel systematic review ideas on topics within spine surgery.

Methods: ChatGPT was instructed to give ten novel systematic review ideas for each of five popular topics in the spine surgery literature: microdiscectomy, laminectomy, spinal fusion, kyphoplasty and disc replacement. A comprehensive literature search was conducted in PubMed, CINAHL, EMBASE and Cochrane. The number of nonsystematic review articles and the number of systematic review papers that had been published on each ChatGPT-generated idea were recorded.

Results: Overall, ChatGPT had a 68% accuracy rate in creating novel systematic review ideas. More specifically, the accuracy rates were 80%, 80%, 40%, 70% and 70% for microdiscectomy, laminectomy, spinal fusion, kyphoplasty and disc replacement, respectively. However, for 32% of the ideas ChatGPT generated, no nonsystematic review articles had been published. The success rates of generating novel systematic review ideas for which nonsystematic reviews had also been published were 71.4%, 50%, 22.2%, 50%, 62.5% and 51.2% for microdiscectomy, laminectomy, spinal fusion, kyphoplasty, disc replacement and overall, respectively.

Conclusions: ChatGPT generated novel systematic review ideas at an overall rate of 68%. ChatGPT can help identify knowledge gaps in spine research that warrant further investigation when used under the supervision of an experienced spine specialist. This technology can be erroneous and lacks intrinsic logic, so it should never be used in isolation.

Level of Evidence: Not applicable.
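
The overall accuracy figure can be cross-checked against the per-topic rates. The following is a minimal Python sketch, assuming each topic contributed exactly ten ideas (as the Methods describe), so that each reported percentage converts to a whole count. The names `reported_rates`, `novel_counts` and `overall_rate` are illustrative and do not come from the study.

```python
# Per-topic accuracy rates reported in the abstract, in percent.
reported_rates = {
    "microdiscectomy": 80,
    "laminectomy": 80,
    "spinal fusion": 40,
    "kyphoplasty": 70,
    "disc replacement": 70,
}

# Assumption: ten ideas per topic, as stated in the Methods.
IDEAS_PER_TOPIC = 10


def novel_counts(rates, ideas_per_topic):
    """Convert per-topic percentages into counts of novel ideas."""
    return {
        topic: round(rate * ideas_per_topic / 100)
        for topic, rate in rates.items()
    }


def overall_rate(rates, ideas_per_topic):
    """Pooled percentage of novel ideas across all topics."""
    counts = novel_counts(rates, ideas_per_topic)
    total_ideas = ideas_per_topic * len(rates)
    return 100 * sum(counts.values()) / total_ideas


print(novel_counts(reported_rates, IDEAS_PER_TOPIC))
print(overall_rate(reported_rates, IDEAS_PER_TOPIC))
```

Under this assumption the five rates correspond to 34 novel ideas out of 50, which agrees with the reported overall rate of 68%.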

Published in Journal of Experimental Orthopaedics

ISSN: 2197-1153 (Online)
Publisher: Wiley
Country of publisher: United States
LCC subjects: Medicine: Surgery: Orthopedic surgery
Website: https://esskajournals.onlinelibrary.wiley.com/journal/21971153
