Frontiers in Medicine (Sep 2024)
Diagnostic accuracy of deep learning in detection and prognostication of renal cell carcinoma: a systematic review and meta-analysis
Abstract
Introduction
The prevalence of renal cell carcinoma (RCC) is increasing among adults. Histopathologic samples obtained after surgical resection or from biopsies of a renal mass require subtype classification for diagnosis, prognosis, and surveillance planning. Deep learning in artificial intelligence (AI) and pathomics are advancing rapidly, leading to numerous applications such as histopathological diagnosis. In this meta-analysis, we assessed the pooled diagnostic performance of deep neural network (DNN) frameworks in detecting RCC subtypes and predicting survival.
Methods
A systematic search was conducted in PubMed, Google Scholar, Embase, and Scopus from inception to November 2023. A random-effects model was used to calculate pooled percentages, means, and 95% confidence intervals (CIs). Accuracy was defined as the number of cases correctly identified by AI out of the total number of cases, i.e. (True Positive + True Negative)/(True Positive + True Negative + False Positive + False Negative). Heterogeneity between study-specific estimates was assessed with the I² statistic. The Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines were used to conduct and report the analysis.
Results
The search retrieved 347 studies; 13 retrospective studies evaluating 5340 patients were included in the final analysis. The pooled performance of the DNN frameworks was as follows: accuracy 92.3% (95% CI: 85.8–95.9; I² = 98.3%), sensitivity 97.5% (95% CI: 83.2–99.7; I² = 92%), specificity 89.2% (95% CI: 29.9–99.4; I² = 99.6%), and area under the curve 0.91 (95% CI: 0.85–0.97.3; I² = 99.6%). Specifically, accuracy in RCC subtype detection was 93.5% (95% CI: 88.7–96.3; I² = 92%), and accuracy in survival prediction was 81% (95% CI: 67.8–89.6; I² = 94.4%).
Discussion
The DNN frameworks showed excellent pooled diagnostic accuracy in classifying RCC into subtypes and grading them for prognostic purposes. Further studies are required to establish generalizability and validate these findings on a larger scale.
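The accuracy definition and the random-effects pooling described in the Methods can be illustrated with a minimal Python sketch. This is not the authors' analysis code. The abstract does not name the estimator, so the DerSimonian-Laird method, the logit transform, and the 0.5 continuity correction used here are assumptions, and the function names `confusion_metrics` and `pool_random_effects` are hypothetical.

```python
import math


def confusion_metrics(tp, tn, fp, fn):
    """Accuracy as defined in the abstract, plus sensitivity and specificity."""
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


def pool_random_effects(events, totals):
    """Pool per-study proportions with a DerSimonian-Laird random-effects model.

    Assumptions (not stated in the abstract): proportions are pooled on the
    logit scale and a 0.5 continuity correction is applied to every study.
    """
    effects, variances = [], []
    for e, n in zip(events, totals):
        e_adj, n_adj = e + 0.5, n + 1.0  # continuity correction (assumption)
        p = e_adj / n_adj
        effects.append(math.log(p / (1 - p)))                # logit proportion
        variances.append(1 / e_adj + 1 / (n_adj - e_adj))    # approximate variance

    # Fixed-effect (inverse-variance) estimate, needed for Cochran's Q.
    w = [1 / v for v in variances]
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1

    # Between-study variance (tau^2) and I^2 as a percentage.
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

    # Random-effects weights add tau^2 to each within-study variance.
    w_re = [1 / (v + tau2) for v in variances]
    mu = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))

    def inv_logit(x):
        return 1 / (1 + math.exp(-x))

    return {
        "pooled": inv_logit(mu),
        "ci_low": inv_logit(mu - 1.96 * se),
        "ci_high": inv_logit(mu + 1.96 * se),
        "i2": i2,
        "tau2": tau2,
    }


if __name__ == "__main__":
    # Hypothetical counts for illustration only; not data from the review.
    print(confusion_metrics(tp=90, tn=80, fp=20, fn=10))
    print(pool_random_effects(events=[50, 95, 88], totals=[100, 100, 100]))
```

The counts in the usage lines are invented for illustration. With widely varying study proportions, I² approaches 100%, which is the pattern reported for most pooled estimates in the Results.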
Keywords