BMC Bioinformatics (Dec 2023)

scInterpreter: a knowledge-regularized generative model for interpretably integrating scRNA-seq data

  • Zhen-Hao Guo,
  • Yan Wu,
  • Siguo Wang,
  • Qinhu Zhang,
  • Jin-Ming Shi,
  • Yan-Bin Wang,
  • Zhan-Heng Chen

DOI
https://doi.org/10.1186/s12859-023-05579-4
Journal volume & issue
Vol. 24, no. 1
pp. 1 – 14

Abstract

Read online

Abstract Background The rapid emergence of single-cell RNA-seq (scRNA-seq) data presents remarkable opportunities for broad investigations through integration analyses. However, most integration models are black boxes that lack interpretability or are hard to train. Results To address the above issues, we propose scInterpreter, a deep learning-based interpretable model. scInterpreter substantially outperforms other state-of-the-art (SOTA) models in multiple benchmark datasets. In addition, scInterpreter is extensible and can integrate and annotate atlas scRNA-seq data. We evaluated the robustness of scInterpreter in a variety of situations. Through comparison experiments, we found that with a knowledge prior, the training process can be significantly accelerated. Finally, we conducted interpretability analysis for each dimension (pathway) of cell representation in the embedding space. Conclusions The results showed that the cell representations obtained by scInterpreter are full of biological significance. Through weight sorting, we found several new genes related to pathways in PBMC dataset. In general, scInterpreter is an effective and interpretable integration tool. It is expected that scInterpreter will bring great convenience to the study of single-cell transcriptomics.

Keywords