BMC Bioinformatics (Nov 2021)

Nested Stochastic Block Models applied to the analysis of single cell data

  • Leonardo Morelli,
  • Valentina Giansanti,
  • Davide Cittaro

DOI
https://doi.org/10.1186/s12859-021-04489-7
Journal volume & issue
Vol. 22, no. 1
pp. 1 – 19

Abstract

Single cell profiling has proven to be a powerful tool in molecular biology for understanding the complex behaviours of heterogeneous systems. Defining the properties of single cells is the primary endpoint of such analysis: cells are typically clustered to uncover the common determinants that can be used to describe functional properties of the cell mixture under investigation. Several approaches have been proposed to identify cell clusters; while this is a matter of active research, one popular approach is based on community detection in neighbourhood graphs by optimisation of modularity. In this paper we propose an alternative and principled solution to this problem, based on Stochastic Block Models. We show that such an approach is not only suitable for the identification of cell groups, but also provides a solid framework for other relevant tasks in single cell analysis, such as label transfer. To encourage the use of Stochastic Block Models, we developed a Python library, schist, that is compatible with the popular scanpy framework.
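
The two clustering criteria contrasted in the abstract can be illustrated on a toy graph. The sketch below is not taken from the paper and does not use schist or scanpy. It assumes a hypothetical six-node neighbourhood graph, and the function names `modularity` and `sbm_log_likelihood` are invented for illustration. It scores a partition with Newman modularity and with the maximised log-likelihood of a plain (non-nested) Bernoulli Stochastic Block Model.

```python
import math
from itertools import combinations

# Toy neighbourhood graph (hypothetical data): two triangles joined by one edge.
EDGES = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]


def modularity(edges, labels):
    """Newman modularity Q = sum_c [e_c / m - (d_c / 2m)^2]."""
    m = len(edges)
    inside = {}  # number of edges with both endpoints in cluster c
    degree = {}  # total degree of the nodes in cluster c
    for u, v in edges:
        degree[labels[u]] = degree.get(labels[u], 0) + 1
        degree[labels[v]] = degree.get(labels[v], 0) + 1
        if labels[u] == labels[v]:
            inside[labels[u]] = inside.get(labels[u], 0) + 1
    return sum(inside.get(c, 0) / m - (d / (2 * m)) ** 2
               for c, d in degree.items())


def sbm_log_likelihood(edges, labels):
    """Maximised log-likelihood of a Bernoulli SBM for a fixed partition."""
    edge_set = {frozenset(e) for e in edges}
    pairs = {}  # block pair -> [observed edges, possible node pairs]
    for u, v in combinations(range(len(labels)), 2):
        key = tuple(sorted((labels[u], labels[v])))
        counts = pairs.setdefault(key, [0, 0])
        counts[0] += frozenset((u, v)) in edge_set
        counts[1] += 1
    total = 0.0
    for e, n in pairs.values():
        if 0 < e < n:  # block pairs with p = 0 or p = 1 contribute zero
            p = e / n
            total += e * math.log(p) + (n - e) * math.log(1 - p)
    return total


two_groups = [0, 0, 0, 1, 1, 1]  # one label per node: the two triangles
one_group = [0, 0, 0, 0, 0, 0]   # every node in a single cluster

print("modularity:", modularity(EDGES, two_groups), modularity(EDGES, one_group))
print("SBM log-likelihood:", sbm_log_likelihood(EDGES, two_groups),
      sbm_log_likelihood(EDGES, one_group))
```

On this graph both scores prefer the two-triangle partition over a single cluster. The raw likelihood never decreases as blocks are split further, so using it to choose the number of groups requires an additional penalty on model complexity.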