BMC Bioinformatics (Jul 2019)

PS-MCL: parallel shotgun coarsened Markov clustering of protein interaction networks

  • Yongsub Lim,
  • Injae Yu,
  • Dongmin Seo,
  • U Kang,
  • Lee Sael

DOI
https://doi.org/10.1186/s12859-019-2856-8
Journal volume & issue
Vol. 20, no. S13
pp. 1 – 12

Abstract

Read online

Abstract Background How can we obtain fast and high-quality clusters in genome scale bio-networks? Graph clustering is a powerful tool applied on bio-networks to solve various biological problems such as protein complexes detection, disease module detection, and gene function prediction. Especially, MCL (Markov Clustering) has been spotlighted due to its superior performance on bio-networks. MCL, however, is skewed towards finding a large number of very small clusters (size 1-3) and fails to detect many larger clusters (size 10+). To resolve this fragmentation problem, MLR-MCL (Multi-level Regularized MCL) has been developed. MLR-MCL still suffers from the fragmentation and, in cases, unrealistically large clusters are generated. Results In this paper, we propose PS-MCL (Parallel Shotgun Coarsened MCL), a parallel graph clustering method outperforming MLR-MCL in terms of running time and cluster quality. PS-MCL adopts an efficient coarsening scheme, called SC (Shotgun Coarsening), to improve graph coarsening in MLR-MCL. SC allows merging multiple nodes at a time, which leads to improvement in quality, time and space usage. Also, PS-MCL parallelizes main operations used in MLR-MCL which includes matrix multiplication. Conclusions Experiments show that PS-MCL dramatically alleviates the fragmentation problem, and outperforms MLR-MCL in quality and running time. We also show that the running time of PS-MCL is effectively reduced with parallelization.

Keywords