IEEE Access (Jan 2020)

DSCD: A Novel Deep Subspace Clustering Denoise Network for Single-Cell Clustering

  • Zhiye Wang,
  • Yiwen Lu,
  • Chang Yu,
  • Tao Zhou,
  • Ruiyi Li,
  • Siyun Hou

DOI
https://doi.org/10.1109/ACCESS.2020.3001986
Journal volume & issue
Vol. 8
pp. 109857 – 109865

Abstract

Single-cell RNA sequencing (scRNA-seq) technology has boomed in the past decade, making it possible to study biological problems at cellular resolution. Current research focuses mainly on exploring cellular heterogeneity, including cell type identification, cell lineage tracing, and spatial model reconstruction of complex organizations. In previous studies, clustering analysis has been the most effective way to group single cells. However, existing scRNA-seq clustering methods separate the pre-processing and clustering tasks, which complicates the problem. In addition, the emergence of big data further limits the application of traditional clustering algorithms to scRNA-seq data. Therefore, developing novel clustering methods and improving clustering accuracy for growing scRNA-seq data is an ongoing task. In this paper, we propose a highly integrated Deep Subspace Clustering Denoise Network named DSCD, which integrates denoising, dimension reduction, and clustering in a unified framework. Based on an autoencoder neural network architecture, DSCD discovers the low-dimensional latent structure within scRNA-seq data from the compressed representation. Furthermore, we add a novel self-expressive denoise layer to learn the global relationships between single cells, which is the main innovation of DSCD. Experimental results on synthetic data demonstrate the effectiveness of the novel denoise layer. From the clustering results on 5 real scRNA-seq datasets, we find that DSCD outperforms related subspace clustering algorithms and state-of-the-art methods. In conclusion, DSCD responds well to the rapidly increasing scale of scRNA-seq data, greatly reduces human interference in dimension reduction, and handles noisy scRNA-seq data properly, thus obtaining higher clustering accuracy.
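
The abstract does not spell out the self-expressive denoise layer, so the following is only a minimal sketch of the general self-expressiveness idea used in subspace clustering, not DSCD's layer. It assumes each cell's latent code can be written as a combination of the other cells' codes, Z ≈ CZ, and replaces the learned layer with a closed-form ridge solution. The function name `self_expressive_coefficients`, the parameter `lam`, and the toy data are hypothetical; no autoencoder is involved.

```python
import numpy as np


def self_expressive_coefficients(Z, lam=1e-3):
    """Closed-form minimiser of ||Z - C Z||_F^2 + lam * ||C||_F^2.

    Z   : (n_cells, n_latent) array, one latent code per row (assumed input).
    lam : ridge penalty (hypothetical default).
    Setting the gradient to zero gives C (Z Z^T + lam I) = Z Z^T.
    """
    G = Z @ Z.T                      # Gram matrix between cells
    n = G.shape[0]
    # G and (G + lam I)^{-1} commute, so solving from the left is equivalent.
    return np.linalg.solve(G + lam * np.eye(n), G)


# Toy data (an assumption, not scRNA-seq): two groups of 5 "cells" that lie in
# two disjoint coordinate subspaces of a 6-dimensional latent space.
rng = np.random.default_rng(0)
Z = np.zeros((10, 6))
Z[:5, :3] = rng.normal(size=(5, 3))  # group 1 uses latent dimensions 0-2
Z[5:, 3:] = rng.normal(size=(5, 3))  # group 2 uses latent dimensions 3-5

C = self_expressive_coefficients(Z)

# Symmetric affinity that a spectral clustering step could take as input.
affinity = np.abs(C) + np.abs(C).T
```

In this toy setting the coefficient matrix is block-diagonal: cells are reconstructed only from cells in the same subspace, which is the kind of global cell-to-cell relationship a clustering step can then exploit.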

Keywords