Heliyon (Sep 2024)

Assessing the ChatGPT aptitude: A competent and effective Dermatology doctor?

  • Chengxiang Lian,
  • Xin Yuan,
  • Santosh Chokkakula,
  • Guanqing Wang,
  • Biao Song,
  • Zhe Wang,
  • Ge Fan,
  • Chengliang Yin

Journal volume & issue
Vol. 10, no. 17
p. e37220

Abstract

Background: The efficacy and proficiency of ChatGPT-3.5 and ChatGPT-4.0 in the accurate diagnosis and management of conditions such as atopic dermatitis (AD) and autoimmune blistering skin diseases (AIBD) remain to be elucidated. This study therefore examined the accuracy and effectiveness of ChatGPT responses concerning the understanding, therapies, and specific cases of these two conditions.
Method: First, the responses provided by both ChatGPT versions to a set of 50 questionnaires were evaluated by five dermatologists, with full adjudication by a third-party reviewer. The comparative analysis set the performance of ChatGPT-3.5 and ChatGPT-4.0 against the diagnostic abilities of three distinct cohorts of qualified clinical professionals. Next, the diagnostic proficiency of ChatGPT-3.5 and ChatGPT-4.0 was assessed on specific cases of autoimmune blistering skin diseases.
Results: In generating responses on fundamental knowledge about AD, both versions of ChatGPT, despite lacking specialized training on medical databases, produced answers that showed a substantial degree of concurrence with evidence-based medical information. The performance of ChatGPT-4.0 exceeded that of ChatGPT-3.5. However, it is crucial to emphasize that ChatGPT-4.0 did not offer answers surpassing those provided by associate senior and senior medical professionals. In the assessment of the ability to recognize particular types of AIBD, both ChatGPT-4.0 and ChatGPT-3.5 were inadequate at providing responses that were both precise and accurate for each individual case of this skin condition.
Conclusion: Both ChatGPT-3.5 and ChatGPT-4.0 are satisfactory for addressing fundamental inquiries related to atopic dermatitis; however, they prove insufficient for diagnosing AIBD. ChatGPT still has a considerable way to go before achieving utility within the professional medical domain.

Keywords