Development of machine learning-based personalized predictive models for risk evaluation of hepatocellular carcinoma in hepatitis B virus-related cirrhosis patients with low levels of serum alpha-fetoprotein

Yuan Xu; Bei Zhang; Fan Zhou; Ying-ping Yi; Xin-Lei Yang; Xiao Ouyang; Hui Hu

Annals of Hepatology (Nov 2024)

Affiliations

Yuan Xu: Medical Big Data Center, the Second Affiliated Hospital of Nanchang University, Nanchang, PR China
Bei Zhang: Department of Gastroenterology, the Second Affiliated Hospital of Nanchang University, Nanchang, PR China
Fan Zhou: Department of Hepatobiliary Surgery, the Second Affiliated Hospital of Nanchang University, Nanchang, PR China
Ying-ping Yi: Medical Big Data Center, the Second Affiliated Hospital of Nanchang University, Nanchang, PR China
Xin-Lei Yang: Medical Big Data Center, the Second Affiliated Hospital of Nanchang University, Nanchang, PR China
Xiao Ouyang: Quiclinic Technology Co., Ltd., Nanchang, PR China
Hui Hu: Medical Big Data Center, the Second Affiliated Hospital of Nanchang University, Nanchang, PR China; Corresponding author.

Journal volume & issue: Vol. 29, no. 6, p. 101540

Abstract

Introduction and Objectives: The increasing incidence of hepatocellular carcinoma (HCC) in China is an urgent issue, necessitating early diagnosis and treatment. This study aimed to develop personalized predictive models by combining machine learning (ML) technology with demographic, medical history, and noninvasive biomarker data. These models can enhance the decision-making capabilities of physicians for HCC in hepatitis B virus (HBV)-related cirrhosis patients with low serum alpha-fetoprotein (AFP) levels.

Patients and Methods: A total of 6,980 patients treated between January 2012 and December 2018 were included. Pre-treatment laboratory tests and clinical data were obtained. The significant risk factors for HCC were identified, and the relative risk of each variable affecting its diagnosis was calculated using ML and univariate regression analysis. The data set was then randomly partitioned into training (80%) and validation (20%) sets to develop the ML models.

Results: Twelve independent risk factors for HCC were identified using Gaussian naïve Bayes, extreme gradient boosting (XGBoost), random forest, and least absolute shrinkage and selection operator (LASSO) regression models. Multivariate analysis revealed that male sex, age >60 years, alkaline phosphatase >150 U/L, AFP >25 ng/mL, carcinoembryonic antigen >5 ng/mL, and fibrinogen >4 g/L were the risk factors, whereas hypertension, calcium 6.8 μmol/L, hemoglobin 40 U/L were the protective factors in HCC patients. Based on these factors, a nomogram was constructed, showing an area under the curve (AUC) of 0.746 (sensitivity = 0.710, specificity = 0.646), which was significantly higher than the AUC of AFP, 0.658 (sensitivity = 0.462, specificity = 0.766). Among the ML algorithms compared, the XGBoost model had an AUC of 0.832 (sensitivity = 0.745, specificity = 0.766) and an independent validation AUC of 0.829 (sensitivity = 0.766, specificity = 0.737), making it the top-performing model in both sets. The external validation results confirmed the accuracy of the XGBoost model.

Conclusions: The proposed XGBoost model demonstrated a promising ability for individualized prediction of HCC in HBV-related cirrhosis patients with low-level AFP.
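
The abstract reports an 80/20 random split and evaluates each model by AUC, sensitivity, and specificity. The following is a minimal Python sketch of how such quantities are commonly computed. It is not the authors' implementation: the function names (`split_train_validation`, `sensitivity_specificity`, `auc`), the fixed random seed, the 0.5 threshold, and the toy labels and scores are all assumptions made for illustration and do not come from the study.

```python
import random


def split_train_validation(records, train_fraction=0.8, seed=0):
    """Randomly partition records into training and validation lists."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]


def sensitivity_specificity(labels, scores, threshold):
    """Sensitivity and specificity when score >= threshold predicts HCC (label 1)."""
    pairs = list(zip(labels, scores))
    tp = sum(1 for y, s in pairs if y == 1 and s >= threshold)
    fn = sum(1 for y, s in pairs if y == 1 and s < threshold)
    tn = sum(1 for y, s in pairs if y == 0 and s < threshold)
    fp = sum(1 for y, s in pairs if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data invented for illustration only (not patient data from the study).
labels = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]

train, validation = split_train_validation(range(6980))
print(len(train), len(validation))                   # 5584 1396
print(auc(labels, scores))                           # 0.9375
print(sensitivity_specificity(labels, scores, 0.5))  # (0.75, 0.75)
```

Sensitivity and specificity depend on the chosen score threshold, whereas AUC summarizes ranking quality across all thresholds, which is why the abstract can report a lower sensitivity but higher specificity for AFP than for the nomogram at their respective cut-offs.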

Published in Annals of Hepatology

ISSN: 1665-2681 (Print); 2659-5982 (Online)
Publisher: Elsevier
Country of publisher: Spain
LCC subjects: Medicine: Internal medicine: Specialties of internal medicine
Website: https://www.journals.elsevier.com/annals-of-hepatology