Scientific Reports (Feb 2024)

cnnImpute: missing value recovery for single cell RNA sequencing data

  • Wenjuan Zhang,
  • Brandon Huckaby,
  • John Talburt,
  • Sherman Weissman,
  • Mary Qu Yang

DOI
https://doi.org/10.1038/s41598-024-53998-x
Journal volume & issue
Vol. 14, no. 1
pp. 1 – 14

Abstract

Read online

Abstract The advent of single-cell RNA sequencing (scRNA-seq) technology has revolutionized our ability to explore cellular diversity and unravel the complexities of intricate diseases. However, due to the inherently low signal-to-noise ratio and the presence of an excessive number of missing values, scRNA-seq data analysis encounters unique challenges. Here, we present cnnImpute, a novel convolutional neural network (CNN) based method designed to address the issue of missing data in scRNA-seq. Our approach starts by estimating missing probabilities, followed by constructing a CNN-based model to recover expression values with a high likelihood of being missing. Through comprehensive evaluations, cnnImpute demonstrates its effectiveness in accurately imputing missing values while preserving the integrity of cell clusters in scRNA-seq data analysis. It achieved superior performance in various benchmarking experiments. cnnImpute offers an accurate and scalable method for recovering missing values, providing a useful resource for scRNA-seq data analysis.