PLoS Computational Biology (Dec 2024)
Hierarchical marker genes selection in scRNA-seq analysis.
Abstract
When analyzing single-cell RNA-seq (scRNA-seq) data containing heterogeneous cell populations, an important task is to select informative marker genes that distinguish cell clusters and allow the clusters to be annotated with biologically meaningful cell types. In existing analysis methods and pipelines, marker genes are typically identified using a one-vs-all strategy, which examines differential expression between one cell cluster and the combination of all other cell clusters. However, when applied to cell clusters belonging to closely related cell types, this strategy often generates overlapping marker genes, which capture the common signature of the closely related clusters but provide limited information for distinguishing them from one another. To address this limitation, we propose a hierarchical marker gene selection strategy that groups similar cell clusters and selects marker genes in a hierarchical manner. This strategy can improve the accuracy and interpretability of cell type identification in scRNA-seq data.
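The contrast between the two strategies can be illustrated with a minimal sketch. It is not the implementation described in this paper: the cluster names (T1, T2, B), the gene names, the expression values, and the grouping TREE are all made up for illustration, the grouping is supplied by hand rather than learned from the data, and a plain difference of cluster means stands in for a proper differential expression test.

```python
from statistics import mean

# Toy cluster-average expression; every value here is invented for illustration.
GENES = ["shared_T", "t1_specific", "t2_specific", "b_specific"]
MEANS = {
    "T1": [9.0, 3.0, 0.0, 0.0],
    "T2": [9.0, 0.0, 3.0, 0.0],
    "B": [0.0, 0.0, 0.0, 8.0],
}

# Hypothetical grouping: T1 and T2 are treated as closely related clusters.
TREE = (("T1", "T2"), "B")


def top_marker(target, background):
    """Return the gene with the largest mean difference, target vs. background."""
    def score(i):
        in_target = mean(MEANS[c][i] for c in target)
        in_background = mean(MEANS[c][i] for c in background)
        return in_target - in_background

    return GENES[max(range(len(GENES)), key=score)]


def one_vs_all():
    """Compare each cluster against the combination of all other clusters."""
    return {c: top_marker([c], [o for o in MEANS if o != c]) for c in MEANS}


def leaves(node):
    """List the cluster names under a node of the grouping tree."""
    if isinstance(node, str):
        return [node]
    return [name for child in node for name in leaves(child)]


def hierarchical(node, out=None):
    """At each level of the tree, compare every child only against its siblings."""
    out = {} if out is None else out
    if isinstance(node, str):
        return out
    for i, child in enumerate(node):
        siblings = [
            name
            for j, other in enumerate(node)
            if j != i
            for name in leaves(other)
        ]
        out["+".join(leaves(child))] = top_marker(leaves(child), siblings)
        hierarchical(child, out)
    return out


if __name__ == "__main__":
    print("one-vs-all:  ", one_vs_all())
    print("hierarchical:", hierarchical(TREE))
```

In this toy setting, one-vs-all assigns the same gene (shared_T) to both T1 and T2, which is the overlap problem described above. The hierarchical pass keeps shared_T as the marker of the combined T1+T2 group and then selects t1_specific and t2_specific when the two clusters are compared only with each other.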