Genome Biology (May 2024)

scCDC: a computational method for gene-specific contamination detection and correction in single-cell and single-nucleus RNA-seq data

  • Weijian Wang,
  • Yihui Cen,
  • Zezhen Lu,
  • Yueqing Xu,
  • Tianyi Sun,
  • Ying Xiao,
  • Wanlu Liu,
  • Jingyi Jessica Li,
  • Chaochen Wang

DOI
https://doi.org/10.1186/s13059-024-03284-w
Journal volume & issue
Vol. 25, no. 1
pp. 1 – 29

Abstract

In droplet-based single-cell and single-nucleus RNA-seq assays, systematic contamination by ambient RNA molecules biases the quantification of gene expression levels. Existing methods correct the contamination for all genes globally. However, their correction efficacy has not been specifically evaluated across varying contamination levels. Here, we show that DecontX and CellBender under-correct highly contaminating genes, while SoupX and scAR over-correct lowly/non-contaminating genes. We then develop scCDC as the first method to detect the contamination-causing genes and correct the expression levels of only these genes, some of which are cell-type markers. Compared with existing decontamination methods, scCDC excels in decontaminating highly contaminating genes while avoiding over-correction of other genes.