JHEP Reports (Feb 2025)

Deep learning helps discriminate between autoimmune hepatitis and primary biliary cholangitis

  • Alessio Gerussi,
  • Oliver Lester Saldanha,
  • Giorgio Cazzaniga,
  • Damiano Verda,
  • Zunamys I. Carrero,
  • Bastian Engel,
  • Richard Taubert,
  • Francesca Bolis,
  • Laura Cristoferi,
  • Federica Malinverno,
  • Francesca Colapietro,
  • Reha Akpinar,
  • Luca Di Tommaso,
  • Luigi Terracciano,
  • Ana Lleo,
  • Mauro Viganó,
  • Cristina Rigamonti,
  • Daniela Cabibi,
  • Vincenza Calvaruso,
  • Fabio Gibilisco,
  • Nicoló Caldonazzi,
  • Alessandro Valentino,
  • Stefano Ceola,
  • Valentina Canini,
  • Eugenia Nofit,
  • Marco Muselli,
  • Julien Calderaro,
  • Dina Tiniakos,
  • Vincenzo L’Imperio,
  • Fabio Pagni,
  • Nicola Zucchini,
  • Pietro Invernizzi,
  • Marco Carbone,
  • Jakob Nikolas Kather

Journal volume & issue
Vol. 7, no. 2
p. 101198

Abstract

Background & Aims: Biliary abnormalities in autoimmune hepatitis (AIH) and interface hepatitis in primary biliary cholangitis (PBC) occur frequently, and misinterpreting them may lead to therapeutic mistakes that harm patients. This study investigates a deep learning (DL)-based pipeline for diagnosing AIH and PBC to aid differential diagnosis.
Methods: We conducted a multicenter study across six European referral centers and built a library of digitized liver biopsy slides dating from 1997 to 2023. A training set of 354 cases (266 AIH and 102 PBC) and an external validation set of 92 cases (62 AIH and 30 PBC) were available for analysis. A novel DL model, the autoimmune liver neural estimator (ALNE), was trained on H&E-stained whole-slide images (WSIs) without human annotations. The ALNE model was evaluated against clinico-pathological diagnoses, and interobserver variability was assessed among general pathologists.
Results: The ALNE model demonstrated high accuracy in differentiating AIH from PBC, achieving an area under the receiver operating characteristic curve of 0.81 in external validation. Attention heatmaps showed that ALNE tends to focus on areas with increased inflammation, associating such patterns predominantly with AIH. A multivariate explainable machine learning (ML) model revealed that PBC cases misclassified as AIH more often had ALP values between 1 × upper limit of normal (ULN) and 2 × ULN, coupled with AST values above 1 × ULN. Agreement among general pathologists was low when they evaluated a random sample of the same cases (Fleiss' kappa 0.09).
Conclusions: The ALNE model is the first system to generate a quantitative and accurate differential diagnosis between AIH and PBC.
Impact and implications: This study demonstrates the potential of the autoimmune liver neural estimator model, a transformer-based deep learning system, to distinguish accurately between autoimmune hepatitis and primary biliary cholangitis using digitized liver biopsy slides without human annotation. The scientific justification for this work lies in the difficulty of differentiating these conditions, which often present with overlapping features and can lead to therapeutic mistakes. In addition, there is a need for quantitative assessment of the information embedded in liver biopsies, which are currently evaluated with qualitative or semi-quantitative methods. The results of this study are relevant to pathologists, researchers, and clinicians, as they point to a diagnostic tool that could reduce interobserver variability and improve diagnostic accuracy for these conditions. Potential methodological limitations, such as the diversity in scanning techniques and slide coloration, were considered in order to support the robustness and generalizability of the findings.
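
The two summary statistics quoted in the abstract, the area under the receiver operating characteristic curve (0.81) and Fleiss' kappa (0.09), can be illustrated with a minimal sketch. This is not the authors' evaluation code. The function names, the coding of AIH as the positive class (1), and all toy numbers below are assumptions made for illustration only, and none of the values are taken from the study.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def fleiss_kappa(counts: Sequence[Sequence[int]]) -> float:
    """Fleiss' kappa for a table where counts[i][j] is the number of
    raters who assigned case i to category j. Every case must be rated
    by the same number of raters (at least two)."""
    n_cases = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("each case needs the same number (>= 2) of ratings")
    n_categories = len(counts[0])

    # Share of all ratings that fall into each category.
    p_cat = [
        sum(row[j] for row in counts) / (n_cases * n_raters)
        for j in range(n_categories)
    ]
    # Observed agreement per case, then averaged over cases.
    p_case = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(p_case) / n_cases
    # Agreement expected by chance.
    p_expected = sum(p * p for p in p_cat)
    if p_expected == 1.0:
        raise ValueError("kappa is undefined when all ratings share one category")
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical toy data, not from the study: 1 = AIH, 0 = PBC.
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.6, 0.1]
print(auroc(toy_labels, toy_scores))  # 0.75

# Hypothetical ratings: 4 cases, 3 raters, columns = (AIH, PBC).
toy_counts = [[3, 0], [2, 1], [1, 2], [0, 3]]
print(round(fleiss_kappa(toy_counts), 3))  # 0.333
```

A kappa of 1 indicates complete agreement and a value near 0 indicates agreement no better than chance, which is why the reported 0.09 reads as low agreement among raters.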

Keywords