Applied Sciences (Jan 2023)

Machine Learning Pipeline for the Automated Prediction of Microvascular Invasion in Hepatocellular Carcinomas

  • Riccardo Biondi,
  • Matteo Renzulli,
  • Rita Golfieri,
  • Nico Curti,
  • Gianluca Carlini,
  • Claudia Sala,
  • Enrico Giampieri,
  • Daniel Remondini,
  • Giulio Vara,
  • Arrigo Cattabriga,
  • Maria Adriana Cocozza,
  • Luigi Vincenzo Pastore,
  • Nicolò Brandi,
  • Antonino Palmeri,
  • Leonardo Scarpetti,
  • Gaia Tanzarella,
  • Matteo Cescon,
  • Matteo Ravaioli,
  • Gastone Castellani,
  • Francesca Coppola

DOI
https://doi.org/10.3390/app13031371
Journal volume & issue
Vol. 13, no. 3
p. 1371

Abstract

Background: Microvascular invasion (MVI) is a necessary step in the metastatic evolution of hepatocellular carcinoma. Predicting the onset of MVI in the early stages of the tumor could improve patient survival and quality of life. This study evaluated whether radiomic features can predict the presence or absence of MVI. Methods: Multiphase contrast-enhanced computed tomography (CECT) images were collected from 49 patients, and radiomic features were extracted from the tumor region and the zone of transition (ZOT). The most relevant features were selected, the dataset was balanced, and the presence or absence of MVI was classified. Three pipeline designs were evaluated, each using cross-validation to split the dataset into training and test sets: the first applied feature selection and dataset balancing outside cross-validation; the second applied dataset balancing outside and feature selection inside; the third applied the entire pipeline inside the cross-validation procedure. Results: Among the features from the tumor areas on CECT, those from the portal and arterial phases were the most predictive. The three pipelines achieved receiver operating characteristic area under the curve (ROC AUC) scores of 0.89, 0.84, and 0.61, respectively. Conclusions: The results confirmed the effectiveness of multiphase CECT and the ZOT in detecting MVI. The three pipelines differed significantly in performance, highlighting that a non-rigorous pipeline design can yield overly optimistic estimates of model performance and generalization capability.
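
The gap between pipelines that select features outside versus inside cross-validation can be illustrated with a small, self-contained sketch. This is not the authors' code. It assumes synthetic pure-noise data, a simple class-mean-difference filter for feature selection, a nearest-centroid classifier, and plain accuracy instead of ROC AUC, and it omits the dataset-balancing step entirely. All function names and parameters (`make_noise_dataset`, `select_top_features`, `cross_validated_accuracy`, `select_inside`) are hypothetical and chosen for illustration only.

```python
import random
import statistics


def make_noise_dataset(n_samples=40, n_features=1000, seed=0):
    """Pure-noise features with balanced random labels: no real signal exists."""
    rng = random.Random(seed)
    X = [[rng.gauss(0.0, 1.0) for _ in range(n_features)] for _ in range(n_samples)]
    y = [i % 2 for i in range(n_samples)]
    rng.shuffle(y)
    return X, y


def select_top_features(X, y, rows, k):
    """Rank features by absolute difference of class means, using only `rows`."""
    scores = []
    for j in range(len(X[0])):
        pos = [X[i][j] for i in rows if y[i] == 1]
        neg = [X[i][j] for i in rows if y[i] == 0]
        scores.append((abs(statistics.fmean(pos) - statistics.fmean(neg)), j))
    scores.sort(reverse=True)
    return [j for _, j in scores[:k]]


def nearest_centroid_predict(X, y, train_rows, test_rows, features):
    """Assign each test row to the class whose training centroid is closest."""
    centroids = {}
    for label in (0, 1):
        members = [i for i in train_rows if y[i] == label]
        centroids[label] = [statistics.fmean([X[i][j] for i in members]) for j in features]
    preds = []
    for i in test_rows:
        dist = {
            label: sum((X[i][j] - c) ** 2 for j, c in zip(features, centroids[label]))
            for label in (0, 1)
        }
        preds.append(min(dist, key=dist.get))
    return preds


def cross_validated_accuracy(X, y, k_features=10, n_folds=5, select_inside=True):
    """k-fold accuracy with feature selection inside or outside the folds."""
    all_rows = list(range(len(X)))
    folds = [all_rows[f::n_folds] for f in range(n_folds)]
    # Leaky variant: selection sees every sample, including future test folds.
    global_features = None if select_inside else select_top_features(X, y, all_rows, k_features)
    correct = 0
    for test_rows in folds:
        held_out = set(test_rows)
        train_rows = [i for i in all_rows if i not in held_out]
        if select_inside:
            # Rigorous variant: selection uses the training folds only.
            features = select_top_features(X, y, train_rows, k_features)
        else:
            features = global_features
        preds = nearest_centroid_predict(X, y, train_rows, test_rows, features)
        correct += sum(p == y[i] for p, i in zip(preds, test_rows))
    return correct / len(X)


X, y = make_noise_dataset()
leaky = cross_validated_accuracy(X, y, select_inside=False)
honest = cross_validated_accuracy(X, y, select_inside=True)
print(f"selection outside CV: {leaky:.2f}")
print(f"selection inside CV:  {honest:.2f}")
```

Because the labels are random, any score well above chance in the first variant comes from information leaking out of the held-out folds during feature selection. This mirrors, in simplified form, the kind of optimism the abstract attributes to the non-rigorous pipelines.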

Keywords