Nature Communications (Jan 2025)

A deep multiple instance learning framework improves microsatellite instability detection from tumor next generation sequencing

  • John Ziegler,
  • Jaclyn F. Hechtman,
  • Satshil Rana,
  • Ryan N. Ptashkin,
  • Gowtham Jayakumaran,
  • Sumit Middha,
  • Shweta S. Chavan,
  • Chad Vanderbilt,
  • Deborah DeLair,
  • Jacklyn Casanova,
  • Jinru Shia,
  • Nicole DeGroat,
  • Ryma Benayed,
  • Marc Ladanyi,
  • Michael F. Berger,
  • Thomas J. Fuchs,
  • A. Rose Brannon,
  • Ahmet Zehir

DOI
https://doi.org/10.1038/s41467-024-54970-z
Journal volume & issue
Vol. 16, no. 1
pp. 1 – 11

Abstract

Microsatellite instability (MSI) is a critical phenotype of cancer genomes and an FDA-recognized biomarker that can guide treatment with immune checkpoint inhibitors. Previous work has demonstrated that next-generation sequencing (NGS) data can be used to identify samples with an MSI-high phenotype. However, low tumor purity, as frequently observed in routine clinical samples, poses a challenge to the sensitivity of existing algorithms. To overcome this critical issue, we developed MiMSI, an MSI classifier based on deep neural networks and trained in a multiple instance learning framework on a dataset that included low-tumor-purity MSI cases. On a challenging yet representative set of cases, MiMSI showed higher sensitivity (0.895) and auROC (0.971) than MSISensor (sensitivity: 0.67; auROC: 0.907), an open-source tool previously validated for clinical use at our institution using MSK-IMPACT large-panel targeted NGS data. In a separate, prospective cohort, MiMSI again outperformed MSISensor in low-purity cases (P = 8.244e-07).
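
For readers less familiar with the two metrics reported above, the following is a minimal sketch of how sensitivity and auROC can be computed from per-sample classifier scores. It is not taken from MiMSI or MSISensor: the function names, the 0.5 threshold, and the toy labels and scores are illustrative assumptions only.

```python
# Illustrative sketch only: names, threshold, and data are assumptions,
# not part of MiMSI or MSISensor.

def sensitivity(labels, scores, threshold=0.5):
    """Fraction of true positives (label 1) scored at or above the threshold."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    if not positives:
        raise ValueError("at least one positive sample is required")
    return sum(s >= threshold for s in positives) / len(positives)


def auroc(labels, scores):
    """Probability that a random positive outscores a random negative.

    Ties count as half a win (the rank-based definition of auROC).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = MSI-high, 0 = microsatellite stable.
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.3, 0.4, 0.2, 0.1, 0.05]

print(f"sensitivity = {sensitivity(labels, scores):.3f}")
print(f"auROC       = {auroc(labels, scores):.3f}")
```

Sensitivity depends on the chosen decision threshold, whereas auROC summarizes ranking quality across all thresholds, which is why the two are reported together.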