Advanced Ultrasound in Diagnosis and Therapy (Dec 2024)
Evaluation of Liver Fibrosis on Grayscale Ultrasound in a Pediatric Population Using a Cloud-based Transfer Learning Artificial Intelligence Platform
Abstract
Objectives: The incidence of chronic liver disease in children is increasing worldwide due to congenital, metabolic, autoimmune, and viral diseases. Liver biopsy is currently considered the gold standard for fibrosis assessment. However, the procedure is invasive, may result in complications, and is prone to sampling error. These limitations have increased the demand for noninvasive methods of fibrosis screening. The integration of artificial intelligence (AI) into ultrasound diagnosis of liver fibrosis has gained interest in clinical research. In the current study, we used a cloud-based AI platform utilizing transfer learning to evaluate the accuracy of a B-mode ultrasound-based AI model compared with pediatric radiologists in detecting liver fibrosis in a pediatric population. Methods: For this IRB-approved study, the charts of 190 pediatric patients who were referred for liver biopsy and ultrasound were reviewed. On average, 14 images of different liver areas were selected per patient, and each radiologist and AI read was based on a single image. A supervised machine learning model for image classification was developed using Google Vision AutoML (Google Inc., Mountain View, CA, USA). The data were divided into a model development cohort (80% of cases; 154 cases, 2324 images) and a model validation cohort for external testing (20% of cases; 36 cases, 360 images). As a comparator, three blinded radiologists read the ultrasound images of the validation cohort and provided a binary diagnosis of fibrotic versus non-fibrotic liver appearance. Tissue sampling was used as the reference standard for all cases. Results: There were 99 and 91 patients in the biopsy-proven fibrosis and non-fibrosis groups, respectively. The model's internal evaluation resulted in a precision of 78.2%, a recall of 78.5%, and an average precision of 87.7%.
In the external validation cohort, the three radiologists (mean ± standard deviation) and Google AutoML (with confidence interval [CI]) achieved, respectively, a sensitivity of 42.04% ± 0.04 and 70.56% (CI 63.32% to 77.10%), a specificity of 50.18% ± 0.04 and 45.00% (CI 37.59% to 52.58%), a positive predictive value of 45.76% ± 0.01 and 56.19% (CI 52.17% to 60.14%), a negative predictive value of 46.39% ± 0.01 and 60.45% (CI 53.65% to 66.86%), and an accuracy of 46.11% ± 0.01 and 57.78% (CI 52.49% to 62.94%). When agreement was evaluated across multiple images from the same patient, intra-reader agreement was 77.2% for AutoML and 90.8% to 92.5% for the three radiologists. The model's F1 scores for the development and validation cohorts were 0.78 and 0.62, respectively. Conclusions: Liver fibrosis assessment in children is challenging without biopsy. An ultrasound-based AI model showed higher sensitivity than the radiologists, but its diagnostic performance is not yet suitable for clinical use.
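For readers who want to relate the reported figures to one another, the following is a minimal Python sketch of how sensitivity, specificity, predictive values, accuracy, and F1 score follow from a binary confusion matrix. It is an illustration only, not the analysis pipeline used in the study. The function names are hypothetical, and the per-image counts (127 true positives, 99 false positives, 81 true negatives, 53 false negatives) are not reported in the abstract; they are assumed here because they are consistent with the reported AutoML percentages for a 360-image validation cohort.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def diagnostic_metrics(tp: int, fp: int, tn: int, fn: int) -> dict[str, float]:
    """Standard binary diagnostic metrics from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # true positive rate (recall)
        "specificity": tn / (tn + fp),  # true negative rate
        "ppv": tp / (tp + fp),          # positive predictive value (precision)
        "npv": tn / (tn + fn),          # negative predictive value
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


# Development cohort: precision and recall as reported in the abstract.
dev_f1 = f1_score(precision=0.782, recall=0.785)

# Validation cohort: ASSUMED per-image counts (not given in the abstract),
# back-calculated so that they match the reported AutoML percentages.
val_metrics = diagnostic_metrics(tp=127, fp=99, tn=81, fn=53)

print(f"development F1: {dev_f1:.2f}")
for name, value in val_metrics.items():
    print(f"{name}: {value:.2%}")
```

Under these assumed counts, the sketch yields a development F1 score of 0.78 and validation values of 70.56% sensitivity, 45.00% specificity, 56.19% positive predictive value, 60.45% negative predictive value, and 57.78% accuracy, matching the figures reported above.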
Keywords