HemaSphere (Sep 2023)

The Gene Expression Classifier ALLCatchR Identifies B-cell Precursor ALL Subtypes and Underlying Developmental Trajectories Across Age

  • Thomas Beder,
  • Björn-Thore Hansen,
  • Alina M. Hartmann,
  • Johannes Zimmermann,
  • Eric Amelunxen,
  • Nadine Wolgast,
  • Wencke Walter,
  • Marketa Zaliova,
  • Željko Antić,
  • Philippe Chouvarine,
  • Lorenz Bartsch,
  • Malwine J. Barz,
  • Miriam Bultmann,
  • Johanna Horns,
  • Sonja Bendig,
  • Jan Kässens,
  • Christoph Kaleta,
  • Gunnar Cario,
  • Martin Schrappe,
  • Martin Neumann,
  • Nicola Gökbuget,
  • Anke Katharina Bergmann,
  • Jan Trka,
  • Claudia Haferlach,
  • Monika Brüggemann,
  • Claudia D. Baldus,
  • Lorenz Bastian

DOI
https://doi.org/10.1097/HS9.0000000000000939
Journal volume & issue
Vol. 7, no. 9
p. e939

Abstract

Current classifications (World Health Organization-HAEM5/ICC) define up to 26 molecular B-cell precursor acute lymphoblastic leukemia (BCP-ALL) disease subtypes by genomic driver aberrations and corresponding gene expression signatures. Identification of driver aberrations by transcriptome sequencing (RNA-Seq) is well established, while systematic approaches for gene expression analysis are less advanced. Therefore, we developed ALLCatchR, a machine learning-based classifier that uses RNA-Seq gene expression data to allocate BCP-ALL samples to all 21 gene expression-defined molecular subtypes. Trained on n = 1869 transcriptome profiles with established subtype definitions (4 cohorts; 55% pediatric / 45% adult), ALLCatchR allowed subtype allocation in 3 independent hold-out cohorts (n = 1018; 75% pediatric / 25% adult) with 95.7% accuracy (averaged sensitivity across subtypes: 91.1% / specificity: 99.8%). High-confidence predictions were achieved in 83.7% of samples with 98.9% accuracy. Only 1.2% of samples remained unclassified. ALLCatchR outperformed existing tools and identified novel driver candidates in previously unassigned samples. Additional modules provided predictions of the samples' blast counts, the patients' sex, and immunophenotype, allowing imputation in cases where this information is missing. We established a novel RNA-Seq reference of human B-lymphopoiesis using 7 FACS-sorted progenitor stages from healthy bone marrow donors. Implementation in ALLCatchR enabled projection of BCP-ALL samples onto this trajectory. This identified shared proximity patterns of BCP-ALL subtypes to normal lymphopoiesis stages, extending immunophenotypic classifications with a novel framework for developmental comparisons of BCP-ALL. ALLCatchR enables routine application of RNA-Seq in BCP-ALL diagnostics, with systematic gene expression analysis for accurate subtype allocation and novel insights into underlying developmental trajectories.
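
The abstract reports overall accuracy alongside sensitivity and specificity averaged across subtypes. As a minimal sketch of how such figures are commonly derived from per-sample predictions, the Python below computes one-vs-rest sensitivity and specificity per subtype and then takes their unweighted mean. This is not ALLCatchR code and not necessarily the authors' exact procedure: the function names, the placeholder subtype labels, and the toy data are assumptions made for illustration only.

```python
def per_subtype_metrics(y_true, y_pred):
    """One-vs-rest sensitivity and specificity for each true subtype label."""
    n = len(y_true)
    metrics = {}
    for label in sorted(set(y_true)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        tn = n - tp - fn - fp
        sensitivity = tp / (tp + fn)  # tp + fn > 0: label occurs in y_true
        # If every sample carries this label there are no negatives.
        specificity = tn / (tn + fp) if (tn + fp) else float("nan")
        metrics[label] = (sensitivity, specificity)
    return metrics


def summarize(y_true, y_pred):
    """Overall accuracy plus unweighted (macro) averages across subtypes."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    metrics = per_subtype_metrics(y_true, y_pred)
    macro_sens = sum(s for s, _ in metrics.values()) / len(metrics)
    macro_spec = sum(c for _, c in metrics.values()) / len(metrics)
    return accuracy, macro_sens, macro_spec


# Toy data with placeholder labels (hypothetical, not real cohort results).
y_true = ["subtype_A", "subtype_A", "subtype_A", "subtype_B", "subtype_B", "subtype_C"]
y_pred = ["subtype_A", "subtype_A", "subtype_B", "subtype_B", "subtype_B", "subtype_C"]

accuracy, macro_sens, macro_spec = summarize(y_true, y_pred)
print(f"accuracy={accuracy:.3f} sensitivity={macro_sens:.3f} specificity={macro_spec:.3f}")
```

Averaging per subtype rather than per sample gives rare subtypes the same weight as common ones, which is why an averaged sensitivity can sit below the overall accuracy when small subtypes are harder to classify.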