BMC Medicine (Jan 2024)
Interpretable artificial intelligence-based app assists inexperienced radiologists in diagnosing biliary atresia from sonographic gallbladder images
Abstract
Abstract Background A previously trained deep learning-based smartphone app provides an artificial intelligence solution to help diagnose biliary atresia from sonographic gallbladder images, but it might be impractical to launch it in real clinical settings. This study aimed to redevelop a new model using original sonographic images and their derived smartphone photos and then test the new model’s performance in assisting radiologists with different experiences to detect biliary atresia in real-world mimic settings. Methods A new model was first trained retrospectively using 3659 original sonographic gallbladder images and their derived 51,226 smartphone photos and tested on 11,410 external validation smartphone photos. Afterward, the new model was tested in 333 prospectively collected sonographic gallbladder videos from 207 infants by 14 inexperienced radiologists (9 juniors and 5 seniors) and 4 experienced pediatric radiologists in real-world mimic settings. Diagnostic performance was expressed as the area under the receiver operating characteristic curve (AUC). Results The new model outperformed the previously published model in diagnosing BA on the external validation set (AUC 0.924 vs 0.908, P = 0.004) with higher consistency (kappa value 0.708 vs 0.609). When tested in real-world mimic settings using 333 sonographic gallbladder videos, the new model performed comparable to experienced pediatric radiologists (average AUC 0.860 vs 0.876) and outperformed junior radiologists (average AUC 0.838 vs 0.773) and senior radiologists (average AUC 0.829 vs 0.749). Furthermore, the new model could aid both junior and senior radiologists to improve their diagnostic performances, with the average AUC increasing from 0.773 to 0.835 for junior radiologists and from 0.749 to 0.805 for senior radiologists. Conclusions The interpretable app-based model showed robust and satisfactory performance in diagnosing biliary atresia, and it could aid radiologists with limited experiences to improve their diagnostic performances in real-world mimic settings.
Keywords