Machine learning approaches to enhance diagnosis and staging of patients with MASLD using routinely available clinical information.

Matthew McTeer; Douglas Applegate; Peter Mesenbrink; Vlad Ratziu; Jörn M Schattenberg; Elisabetta Bugianesi; Andreas Geier; Manuel Romero Gomez; Jean-Francois Dufour; Mattias Ekstedt; Sven Francque; Hannele Yki-Jarvinen; Michael Allison; Luca Valenti; Luca Miele; Michael Pavlides; Jeremy Cobbold; Georgios Papatheodoridis; Adriaan G Holleboom; Dina Tiniakos; Clifford Brass; Quentin M Anstee; Paolo Missier; LITMUS Consortium investigators

doi:10.1371/journal.pone.0299487

PLoS ONE (Jan 2024)

Journal volume & issue: Vol. 19, no. 2, p. e0299487

Abstract

Aims

Metabolic dysfunction-associated steatotic liver disease (MASLD) outcomes such as MASH (metabolic dysfunction-associated steatohepatitis), fibrosis and cirrhosis are ordinarily determined by resource-intensive and invasive biopsies. We aim to show that routine clinical tests offer sufficient information to predict these endpoints.

Methods

Using the LITMUS Metacohort derived from the European NAFLD Registry, the largest MASLD dataset in Europe, we create three feature combinations that vary in how difficult they are to obtain, including a 19-variable feature set that can be obtained through a routine clinical appointment or blood test. These data were used to train predictive models with the supervised machine learning (ML) algorithm XGBoost, alongside the missing-data imputation technique MICE and the class-balancing algorithm SMOTE. Shapley Additive exPlanations (SHAP) were added to determine the relative importance of each clinical variable.

Results

Analysing nine biopsy-derived MASLD outcomes, with cohort sizes ranging from 5385 to 6673 subjects, we were able to classify individuals with training-set AUCs ranging from 0.719 to 0.994, including classifying individuals with At-Risk MASH at an AUC of 0.899. Using two further feature combinations of 26 and 35 variables, which included composite scores known to be good indicators for MASLD endpoints and advanced specialist tests, we found that predictive performance did not improve sufficiently. We are also able to present local and global explanations for each ML model, offering clinicians interpretability without the expense of worse predictive performance.

Conclusions

This study developed a series of ML models, with accuracy ranging from 71.9% to 99.4%, that use only easily extractable and readily available information to predict MASLD outcomes that are usually determined through highly invasive means.
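The class-balancing step named in the methods, SMOTE, can be illustrated with a minimal standard-library sketch. This is not the authors' implementation: the function name `smote_oversample`, its parameters, and the toy data are assumptions made for illustration, and a real pipeline would more likely rely on an established library. The idea shown is only the core of SMOTE: each synthetic minority-class sample is placed at a random point on the line segment between a minority sample and one of its k nearest minority-class neighbours.

```python
import random
from math import dist


def smote_oversample(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority samples.

    Hypothetical helper for illustration only; not the code used in the study.
    `minority` is a list of equal-length numeric tuples (minority-class rows).
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other points
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest minority neighbours, excluding the point itself.
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: dist(base, minority[j]),
        )[:k]
        neighbour = minority[rng.choice(neighbours)]
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Toy minority class: the four corners of the unit square.
toy_minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
toy_synthetic = smote_oversample(toy_minority, n_new=6, k=2, seed=1)
```

Because every synthetic point is a convex combination of two real minority samples, it always lies inside the bounding box of the minority class. In practice the oversampling would be applied to the training split only, so that synthetic rows do not leak into evaluation data.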

Published in PLoS ONE

ISSN: 1932-6203 (Online)
Publisher: Public Library of Science (PLoS)
Country of publisher: United States
LCC subjects: Medicine; Science
Website: https://journals.plos.org/plosone/
