Nature Communications (Jul 2024)

Pre-trained multimodal large language model enhances dermatological diagnosis using SkinGPT-4

  • Juexiao Zhou,
  • Xiaonan He,
  • Liyuan Sun,
  • Jiannan Xu,
  • Xiuying Chen,
  • Yuetan Chu,
  • Longxi Zhou,
  • Xingyu Liao,
  • Bin Zhang,
  • Shawn Afvari,
  • Xin Gao

DOI
https://doi.org/10.1038/s41467-024-50043-3
Journal volume & issue
Vol. 15, no. 1
pp. 1 – 12

Abstract

Large language models (LLMs) have recently shown tremendous potential for advancing medical diagnosis, particularly dermatological diagnosis, an important task given that skin and subcutaneous diseases rank among the leading contributors to the global burden of nonfatal disease. Here we present SkinGPT-4, an interactive dermatology diagnostic system based on multimodal large language models. We aligned a pre-trained vision transformer with the LLM Llama-2-13b-chat by collecting an extensive set of skin disease images (comprising 52,929 publicly available and proprietary images) together with clinical concepts and doctors’ notes, and by designing a two-step training strategy. We quantitatively evaluated SkinGPT-4 on 150 real-life cases with board-certified dermatologists. With SkinGPT-4, users can upload their own skin photos for diagnosis, and the system autonomously evaluates the images, identifies the characteristics and categories of the skin conditions, performs in-depth analysis, and provides interactive treatment recommendations.
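The abstract describes aligning a pre-trained vision transformer with Llama-2-13b-chat but does not spell out the mechanism here. The sketch below illustrates one common way such an alignment is done, and is an assumption rather than the authors' implementation: features from a frozen vision encoder are passed through a small trainable projection into the token-embedding space of a frozen LLM, so that image tokens can be prepended to the text prompt. All module names and the vision-feature width are illustrative; the LLM hidden size of 5120 corresponds to Llama-2-13b.

```python
# Minimal sketch (not the authors' code) of vision-to-LLM alignment.
# Assumptions: a frozen ViT produces patch features of width VIS_DIM,
# and only the linear projector is trained to map them into the
# embedding space of a frozen Llama-2-13b-chat model.
import torch
import torch.nn as nn

VIS_DIM = 1408   # assumed width of vision-transformer patch features
LLM_DIM = 5120   # hidden size of Llama-2-13b

class VisualProjector(nn.Module):
    """Trainable bridge from vision-encoder features to LLM embedding space."""
    def __init__(self, vis_dim: int = VIS_DIM, llm_dim: int = LLM_DIM):
        super().__init__()
        self.proj = nn.Linear(vis_dim, llm_dim)

    def forward(self, vis_feats: torch.Tensor) -> torch.Tensor:
        # vis_feats: (batch, num_visual_tokens, vis_dim)
        return self.proj(vis_feats)  # (batch, num_visual_tokens, llm_dim)

# Toy forward pass with dummy tensors standing in for the frozen encoders.
batch, n_img_tokens, n_txt_tokens = 1, 32, 16
vis_feats = torch.randn(batch, n_img_tokens, VIS_DIM)    # from frozen vision transformer
txt_embeds = torch.randn(batch, n_txt_tokens, LLM_DIM)   # from frozen LLM embedding layer

projector = VisualProjector()
img_embeds = projector(vis_feats)

# The LLM would then attend over [image tokens ; prompt tokens] to generate
# the diagnostic description; in this kind of setup only the projector's
# parameters are updated during alignment training.
llm_inputs = torch.cat([img_embeds, txt_embeds], dim=1)
print(llm_inputs.shape)  # torch.Size([1, 48, 5120])
```

In such a design, the two-step training strategy mentioned in the abstract would plausibly first align the projector on image-concept pairs and then fine-tune on image-note pairs for interactive diagnosis, but the exact procedure is detailed in the paper itself.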