Cancer Imaging (Oct 2024)

Cross-institutional evaluation of deep learning and radiomics models in predicting microvascular invasion in hepatocellular carcinoma: validity, robustness, and ultrasound modality efficacy comparison

  • Weibin Zhang,
  • Qihui Guo,
  • Yuli Zhu,
  • Meng Wang,
  • Tong Zhang,
  • Guangwen Cheng,
  • Qi Zhang,
  • Hong Ding

DOI
https://doi.org/10.1186/s40644-024-00790-9
Journal volume & issue
Vol. 24, no. 1
pp. 1 – 10

Abstract

Purpose To conduct a head-to-head comparison between deep learning (DL) and radiomics models across institutions for predicting microvascular invasion (MVI) in hepatocellular carcinoma (HCC) and to investigate model robustness and generalizability through rigorous internal and external validation.
Methods This retrospective study included 2304 preoperative images of 576 HCC lesions from two centers, with MVI status determined by postoperative histopathology. We developed DL and radiomics models for predicting the presence of MVI using B-mode ultrasound, contrast-enhanced ultrasound (CEUS) at the arterial, portal, and delayed phases, and a combined modality (B + CEUS). For radiomics, we constructed models with enlarged vs. original regions of interest (ROIs). Cross-validation was performed by training models on one center's dataset and validating them on the other's, and vice versa. This allowed assessment of the validity of different ultrasound modalities and the cross-center robustness of the models. The optimal model combined with alpha-fetoprotein (AFP) was also validated. The head-to-head comparison was based on the area under the receiver operating characteristic curve (AUC).
Results Thirteen DL models and 25 radiomics models using different ultrasound modalities were constructed and compared. B + CEUS was the optimal modality for both DL and radiomics models. The DL model achieved AUCs of 0.802–0.818 internally and 0.667–0.688 externally across the two centers, whereas radiomics achieved AUCs of 0.749–0.869 internally and 0.646–0.697 externally. The radiomics models showed overall improvement with enlarged ROIs (P < 0.05 […]). The DL models were robust across institutions (P > 0.05 for all modalities, 1.6–2.1% differences in AUC for the optimal modality), whereas the radiomics models had relatively limited robustness across the two centers (12% drop-off in AUC for the optimal modality). Adding AFP improved the DL models (P < 0.05 […]).
Conclusion Cross-institutional validation indicated that DL demonstrated better robustness than radiomics for preoperative MVI prediction in patients with HCC, representing a promising solution to non-standardized ultrasound examination procedures.

Keywords