Heliyon (Nov 2024)

A machine learning-based predictive model discriminates nonalcoholic steatohepatitis from nonalcoholic fatty liver disease

  • Yuqi Yan,
  • Danhui Gan,
  • Ping Zhang,
  • Haizhu Zou,
  • MinMin Li

Journal volume & issue
Vol. 10, no. 21
p. e38848

Abstract

Background: Non-alcoholic fatty liver disease (NAFLD) is a leading cause of liver-related morbidity and mortality. Diagnosing non-alcoholic steatohepatitis (NASH) plays a crucial role in the management of NAFLD patients.

Objective: This observational study aimed to build a machine learning model to identify NASH in NAFLD patients.

Methods: The clinical characteristics and initial laboratory data of 259 NAFLD patients (Cohort 1) were collected to train the model and perform internal validation. We compared models built with five machine learning algorithms and selected the best-performing models. Model performance was evaluated using receiver operating characteristic (ROC) curves, sensitivity, specificity, and accuracy. External validation was then performed on the NAFLD patients in Cohort 2 (n = 181).

Results: We identified six independent risk factors for predicting NASH: neutrophil percentage (NEU%), the aspartate aminotransferase/alanine aminotransferase ratio (AST/ALT), hematocrit (HCT), creatinine (CREA), uric acid (UA), and prealbumin (PA). The NASH-XGB6 model, built with the XGBoost algorithm, showed sufficient prediction accuracy, with areas under the ROC curve (AUC) of 0.95 (95% CI, 0.91–0.98) in Cohort 1 and 0.90 (95% CI, 0.88–0.93) in Cohort 2.

Conclusions: NASH-XGB6 can serve as an effective tool for identifying NASH among NAFLD patients.
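The evaluation metrics named in the Methods (sensitivity, specificity, accuracy, and the area under the ROC curve) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the 0.5 decision threshold, and the toy labels and scores are all hypothetical, chosen only to show how each metric is computed for a binary NASH versus non-NASH classifier.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn); label 1 = NASH, label 0 = NAFLD without NASH."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Sensitivity, specificity, and accuracy from hard predictions."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return {
        "sensitivity": tp / (tp + fn),  # share of true NASH cases detected
        "specificity": tn / (tn + fp),  # share of non-NASH cases correctly ruled out
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, not taken from the study.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.7, 0.2]  # model-predicted NASH probabilities
y_pred = [1 if s >= 0.5 else 0 for s in scores]    # assumed threshold of 0.5

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, auc)
```

The pairwise AUC definition is used here because it needs no external libraries; in practice a library routine would be used, and the reported confidence intervals would require an additional resampling or analytic step not shown in this sketch.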

Keywords