npj Digital Medicine (May 2024)
A systematic review and meta-analysis of artificial intelligence versus clinicians for skin cancer diagnosis
Abstract
Abstract Scientific research of artificial intelligence (AI) in dermatology has increased exponentially. The objective of this study was to perform a systematic review and meta-analysis to evaluate the performance of AI algorithms for skin cancer classification in comparison to clinicians with different levels of expertise. Based on PRISMA guidelines, 3 electronic databases (PubMed, Embase, and Cochrane Library) were screened for relevant articles up to August 2022. The quality of the studies was assessed using QUADAS-2. A meta-analysis of sensitivity and specificity was performed for the accuracy of AI and clinicians. Fifty-three studies were included in the systematic review, and 19 met the inclusion criteria for the meta-analysis. Considering all studies and all subgroups of clinicians, we found a sensitivity (Sn) and specificity (Sp) of 87.0% and 77.1% for AI algorithms, respectively, and a Sn of 79.78% and Sp of 73.6% for all clinicians (overall); differences were statistically significant for both Sn and Sp. The difference between AI performance (Sn 92.5%, Sp 66.5%) vs. generalists (Sn 64.6%, Sp 72.8%), was greater, when compared with expert clinicians. Performance between AI algorithms (Sn 86.3%, Sp 78.4%) vs expert dermatologists (Sn 84.2%, Sp 74.4%) was clinically comparable. Limitations of AI algorithms in clinical practice should be considered, and future studies should focus on real-world settings, and towards AI-assistance.