Nature Communications (Jul 2024)

A unified model-based framework for doublet or multiplet detection in single-cell multiomics data

  • Haoran Hu,
  • Xinjun Wang,
  • Site Feng,
  • Zhongli Xu,
  • Jing Liu,
  • Elisa Heidrich-O’Hare,
  • Yanshuo Chen,
  • Molin Yue,
  • Lang Zeng,
  • Ziqi Rong,
  • Tianmeng Chen,
  • Timothy Billiar,
  • Ying Ding,
  • Heng Huang,
  • Richard H. Duerr,
  • Wei Chen

DOI
https://doi.org/10.1038/s41467-024-49448-x
Journal volume & issue
Vol. 15, no. 1
pp. 1 – 16

Abstract

Droplet-based single-cell sequencing techniques rely on the fundamental assumption that each droplet encapsulates a single cell, enabling individual cell omics profiling. However, the inevitable issue of multiplets, where two or more cells are encapsulated within a single droplet, can lead to spurious cell type annotations and obscure true biological findings. The issue of multiplets is exacerbated in single-cell multiomics settings, where integrating cross-modality information for clustering can inadvertently promote the aggregation of multiplet clusters and increase the risk of erroneous cell type annotations. Here, we propose a compound Poisson model-based framework for multiplet detection in single-cell multiomics data. Leveraging experimental cell hashing results as the ground truth for multiplet status, we conducted trimodal DOGMA-seq experiments and generated 17 benchmarking datasets from two tissues, involving a total of 280,123 droplets. We demonstrated that the proposed method is an essential tool for integrating cross-modality multiplet signals, effectively eliminating multiplet clusters in single-cell multiomics data, a task at which the benchmarked single-omics methods proved inadequate.