PLoS Medicine (Jun 2020)

Predicting and elucidating the etiology of fatty liver disease: A machine learning modeling and validation study in the IMI DIRECT cohorts.

  • Naeimeh Atabaki-Pasdar,
  • Mattias Ohlsson,
  • Ana Viñuela,
  • Francesca Frau,
  • Hugo Pomares-Millan,
  • Mark Haid,
  • Angus G Jones,
  • E Louise Thomas,
  • Robert W Koivula,
  • Azra Kurbasic,
  • Pascal M Mutie,
  • Hugo Fitipaldi,
  • Juan Fernandez,
  • Adem Y Dawed,
  • Giuseppe N Giordano,
  • Ian M Forgie,
  • Timothy J McDonald,
  • Femke Rutters,
  • Henna Cederberg,
  • Elizaveta Chabanova,
  • Matilda Dale,
  • Federico De Masi,
  • Cecilia Engel Thomas,
  • Kristine H Allin,
  • Tue H Hansen,
  • Alison Heggie,
  • Mun-Gwan Hong,
  • Petra J M Elders,
  • Gwen Kennedy,
  • Tarja Kokkola,
  • Helle Krogh Pedersen,
  • Anubha Mahajan,
  • Donna McEvoy,
  • Francois Pattou,
  • Violeta Raverdy,
  • Ragna S Häussler,
  • Sapna Sharma,
  • Henrik S Thomsen,
  • Jagadish Vangipurapu,
  • Henrik Vestergaard,
  • Leen M 't Hart,
  • Jerzy Adamski,
  • Petra B Musholt,
  • Soren Brage,
  • Søren Brunak,
  • Emmanouil Dermitzakis,
  • Gary Frost,
  • Torben Hansen,
  • Markku Laakso,
  • Oluf Pedersen,
  • Martin Ridderstråle,
  • Hartmut Ruetten,
  • Andrew T Hattersley,
  • Mark Walker,
  • Joline W J Beulens,
  • Andrea Mari,
  • Jochen M Schwenk,
  • Ramneek Gupta,
  • Mark I McCarthy,
  • Ewan R Pearson,
  • Jimmy D Bell,
  • Imre Pavo,
  • Paul W Franks

DOI
https://doi.org/10.1371/journal.pmed.1003149
Journal volume & issue
Vol. 17, no. 6
p. e1003149

Abstract

Background
Non-alcoholic fatty liver disease (NAFLD) is highly prevalent and causes serious health complications in individuals with and without type 2 diabetes (T2D). Early diagnosis of NAFLD is important, as this can help prevent irreversible damage to the liver and, ultimately, hepatocellular carcinomas. We sought to expand etiological understanding and develop a diagnostic tool for NAFLD using machine learning.

Methods and findings
We utilized the baseline data from IMI DIRECT, a multicenter prospective cohort study of 3,029 European-ancestry adults recently diagnosed with T2D (n = 795) or at high risk of developing the disease (n = 2,234). Multi-omics (genetic, transcriptomic, proteomic, and metabolomic) and clinical (liver enzymes and other serological biomarkers, anthropometry, measures of beta-cell function, insulin sensitivity, and lifestyle) data comprised the key input variables. The models were trained on MRI-image-derived liver fat content (

Conclusions
In this study, we developed several models with different combinations of clinical and omics data and identified biological features that appear to be associated with liver fat accumulation. In general, the clinical variables showed better prediction ability than the complex omics variables. However, the combination of omics and clinical variables yielded the highest accuracy. We have incorporated the developed clinical models into a web interface (see: https://www.predictliverfat.org/) and made it available to the community.

Trial registration
ClinicalTrials.gov NCT03814915.