PLoS Computational Biology (Jun 2021)

SDImpute: A statistical block imputation method based on cell-level and gene-level information for dropouts in single-cell RNA-seq data.

  • Jing Qi,
  • Yang Zhou,
  • Zicen Zhao,
  • Shuilin Jin

DOI
https://doi.org/10.1371/journal.pcbi.1009118
Journal volume & issue
Vol. 17, no. 6
p. e1009118

Abstract

Read online

The single-cell RNA sequencing (scRNA-seq) technologies obtain gene expression at single-cell resolution and provide a tool for exploring cell heterogeneity and cell types. As the low amount of extracted mRNA copies per cell, scRNA-seq data exhibit a large number of dropouts, which hinders the downstream analysis of the scRNA-seq data. We propose a statistical method, SDImpute (Single-cell RNA-seq Dropout Imputation), to implement block imputation for dropout events in scRNA-seq data. SDImpute automatically identifies the dropout events based on the gene expression levels and the variations of gene expression across similar cells and similar genes, and it implements block imputation for dropouts by utilizing gene expression unaffected by dropouts from similar cells. In the experiments, the results of the simulated datasets and real datasets suggest that SDImpute is an effective tool to recover the data and preserve the heterogeneity of gene expression across cells. Compared with the state-of-the-art imputation methods, SDImpute improves the accuracy of the downstream analysis including clustering, visualization, and differential expression analysis.