Mayo Clinic Proceedings: Digital Health (Sep 2023)

Plastic Surgery and Artificial Intelligence: How ChatGPT Improved Operation Note Accuracy, Time, and Education

  • Ahmad M. Abdelhady, MBBCh, MRCS
  • Christopher R. Davis, BSc, MB ChB, MRCS (Eng), PhD, FRCS (Plast)

Journal volume & issue
Vol. 1, no. 3
pp. 299 – 308

Abstract

Objectives: To evaluate the feasibility and effectiveness of using a large language model (LLM), specifically the chat generative pretrained transformer (ChatGPT), as a supporting and educational tool for plastic surgeons.

Patients and Methods: Operative notes for plastic surgery procedures were generated using ChatGPT-4 and compared with handwritten or computer-written notes for the same procedures. All operations were performed at a single institution from February 1, 2023, to April 20, 2023, by 4 surgeons. The data compared, including Likert-scale ratings, covered the following: procedure type; seniority of the operating surgeon and of the operative note creator; type of note; time taken for the surgeon-generated note; time taken for the artificial intelligence (AI)-generated note; adherence of the AI note to current guidelines; surgeon satisfaction with the AI-generated note; patient demographic characteristics; and patient satisfaction with the AI-generated note.

Results: ChatGPT-generated operative notes (n=30) took considerably less time to create than human-generated notes (5.1 seconds vs 7.10 minutes; P<.05), and 100% of the ChatGPT notes adhered to the current guidelines. Surgeons and patients expressed high satisfaction with the AI-generated operative notes: 13 surgeons were very satisfied, 12 were satisfied, 4 were neutral, and only 1 was dissatisfied with the generated operative note. Among patients, 13 were very satisfied, 13 were satisfied, 3 were neutral, and only 1 was dissatisfied. The mean time for surgeons to create a human note was 7.10 minutes (standard error of the mean, 0.334 minutes). By contrast, the ChatGPT platform generated notes in 5.1 seconds (standard error of the mean, 0.12 seconds), with an average of 2.1 edits needed per note. In addition, another AI tool enhanced the operative notes with realistic generated images illustrating the surgical incisions.

Conclusion: LLM-generated notes are significantly faster to create, adhere to guidelines, and receive high satisfaction ratings from both surgeons and patients. AI-generated hyper-realistic images further enhanced the operative notes. Although AI cannot replace human input, these findings show its potential in the medical field and the opportunities for future advancement.