Annals of Hepatology (Sep 2024)

Accurate prediction of all-cause mortality in patients with metabolic dysfunction-associated steatotic liver disease using electronic health records

  • Ignat Drozdov,
  • Benjamin Szubert,
  • Ian A. Rowe,
  • Timothy J. Kendall,
  • Jonathan A. Fallowfield

Journal volume & issue
Vol. 29, no. 5
p. 101528

Abstract

Introduction and Objectives: Despite the huge clinical burden of metabolic dysfunction-associated steatotic liver disease (MASLD), validated tools for early risk stratification are lacking, and heterogeneous disease expression and a highly variable rate of progression to clinical outcomes result in prognostic uncertainty. We aimed to investigate longitudinal electronic health record-based outcome prediction in MASLD using a state-of-the-art machine learning model.

Patients and Methods: A deep-learning model for all-cause mortality prediction was developed using data from n = 940 patients with histologically defined MASLD. Patient timelines, spanning 12 years, were fully annotated with demographic/clinical characteristics, ICD-9 and ICD-10 codes, blood test results, prescribing data, and secondary care activity. A Transformer neural network (TNN) was trained to output concomitant probabilities of 12-, 24-, and 36-month all-cause mortality. In-sample performance was assessed using 5-fold cross-validation. Out-of-sample performance was assessed in an independent set of n = 528 MASLD patients.

Results: In-sample, the model achieved an area under the receiver operating characteristic curve (AUROC) of 0.74–0.90 (95% CI: 0.72–0.94), sensitivity of 64%–82%, specificity of 75%–92%, and positive predictive value (PPV) of 94%–98%. In out-of-sample validation, the model achieved an AUROC of 0.70–0.86 (95% CI: 0.67–0.90), sensitivity of 69%–70%, specificity of 96%–97%, and PPV of 75%–77%. Key predictive factors, identified using coefficients of determination, were age, presence of type 2 diabetes, and history of hospital admissions with length of stay >14 days.

Conclusions: A TNN, applied to routinely collected longitudinal electronic health records, achieved good performance in prediction of 12-, 24-, and 36-month all-cause mortality in patients with MASLD. Extrapolation of our technique to population-level data will enable scalable and accurate risk stratification to identify people most likely to benefit from anticipatory health care and personalized interventions.
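
For readers less familiar with the metrics reported above, the following is a minimal Python sketch of how AUROC, sensitivity, specificity, and PPV can be computed from binary outcome labels and predicted probabilities. It is not the authors' code: the function names `confusion_metrics` and `auroc`, the 0.5 decision threshold, and the toy data are all illustrative assumptions, and the abstract does not state which threshold or software was used in the study.

```python
from typing import Dict, Sequence


def confusion_metrics(
    y_true: Sequence[int], y_score: Sequence[float], threshold: float = 0.5
) -> Dict[str, float]:
    """Sensitivity, specificity and PPV at one decision threshold.

    The 0.5 default is an assumption for illustration only.
    """
    tp = fp = tn = fn = 0
    for label, score in zip(y_true, y_score):
        predicted_positive = score >= threshold
        if predicted_positive and label == 1:
            tp += 1
        elif predicted_positive and label == 0:
            fp += 1
        elif not predicted_positive and label == 0:
            tn += 1
        else:
            fn += 1
    nan = float("nan")
    return {
        "sensitivity": tp / (tp + fn) if (tp + fn) else nan,
        "specificity": tn / (tn + fp) if (tn + fp) else nan,
        "ppv": tp / (tp + fp) if (tp + fp) else nan,
    }


def auroc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUROC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as one half)."""
    pos = [s for label, s in zip(y_true, y_score) if label == 1]
    neg = [s for label, s in zip(y_true, y_score) if label == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy data (invented for illustration): 1 = died within the horizon, 0 = survived.
labels = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]

print(confusion_metrics(labels, scores))
print(auroc(labels, scores))
```

Unlike AUROC, sensitivity, specificity, and PPV depend on the chosen threshold, and PPV also depends on how common the outcome is in the cohort. This is one reason PPV can differ between a development set and an independent validation set.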

Keywords