Genome Biology (Aug 2021)

MultiK: an automated tool to determine optimal cluster numbers in single-cell RNA sequencing data

  • Siyao Liu,
  • Aatish Thennavan,
  • Joseph P. Garay,
  • J. S. Marron,
  • Charles M. Perou

DOI
https://doi.org/10.1186/s13059-021-02445-5
Journal volume & issue
Vol. 22, no. 1
pp. 1 – 21

Abstract

Single-cell RNA sequencing (scRNA-seq) provides new opportunities to characterize cell populations, typically through some form of clustering analysis. Estimating the optimal cluster number (K) is a crucial step that is often ignored. Our approach improves most current scRNA-seq clustering methods by providing an objective estimate of the number of groups from a multi-resolution perspective. MultiK is a tool for the objective selection of insightful Ks and achieves high robustness through a consensus clustering approach. We demonstrate that MultiK identifies reproducible groups in scRNA-seq data, thus providing an objective means of estimating the number of possible groups or cell-type populations present.

Keywords