Genome Biology (Nov 2023)

Evaluation of deep learning-based feature selection for single-cell RNA sequencing data analysis

  • Hao Huang,
  • Chunlei Liu,
  • Manoj M. Wagle,
  • Pengyi Yang

DOI
https://doi.org/10.1186/s13059-023-03100-x
Journal volume & issue
Vol. 24, no. 1
pp. 1 – 20

Abstract

Background: Feature selection is an essential task in single-cell RNA-seq (scRNA-seq) data analysis and can be critical for gene dimension reduction and downstream analyses, such as gene marker identification and cell type classification. Most popular methods for feature selection from scRNA-seq data are based on the concept of differential distribution, wherein a statistical model is used to detect changes in gene expression among cell types. The recent development of deep learning-based feature selection methods provides an alternative to traditional differential distribution-based methods, in that the importance of a gene is determined by neural networks.

Results: In this work, we explore the utility of various deep learning-based feature selection methods for scRNA-seq data analysis. We sample from the Tabula Muris and Tabula Sapiens atlases to create scRNA-seq datasets with a range of data properties and evaluate the performance of traditional and deep learning-based feature selection methods in terms of cell type classification, feature selection reproducibility and diversity, and computational time.

Conclusions: Our study provides a reference for future development and application of deep learning-based feature selection methods for single-cell omics data analyses.