IEEE Access (Jan 2019)

GPU Implementation of Pairwise Gaussian Mixture Models for Multi-Modal Gene Co-Expression Networks

  • Benjamin T. Shealy,
  • Josh J. R. Burns,
  • Melissa C. Smith,
  • F. Alex Feltus,
  • Stephen P. Ficklin

DOI
https://doi.org/10.1109/ACCESS.2019.2951284
Journal volume & issue
Vol. 7
pp. 160845 – 160857

Abstract

Gene co-expression networks (GCNs) are widely used in bioinformatics research to perform system-level analyses of organisms based on the pairwise correlation between all expressed genes. For large datasets which contain samples from multiple sources, gene pairs can exhibit multiple modes of co-expression which confound typical correlation approaches. A clustering method such as Gaussian Mixture Models (GMMs) may be used to separate the modes of each gene pair in an unsupervised manner, prior to computing the correlation of each mode. However, pairwise clustering significantly increases the computational cost of constructing a GCN, as several clustering models must be evaluated for each gene pair, and the number of gene pairs grows rapidly with the number of genes. In this paper, we present a heterogeneous, high-throughput multi-CPU/GPU software package for multi-modal GCN construction, implemented in version 3 of the Knowledge Independent Network Construction (KINC) software. We determine the optimal values for several execution parameters of the GPU implementation, and we benchmark our CPU and GPU implementations for up to 8 CPUs/GPUs. Our GPU implementation achieves a 167× speedup over the corresponding CPU implementation, as well as a 500× speedup over KINCv1.
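The pairwise workflow described in the abstract can be illustrated with a minimal CPU-only sketch. It is not the KINC implementation: every function name below is hypothetical, the Bayesian information criterion (BIC) is assumed as the model-selection rule (the abstract does not name one), and only mixtures with one or two components are tried. For a single gene pair it fits each candidate GMM with expectation-maximization, keeps the model with the lowest BIC, and computes one Pearson correlation per mode. The helper `n_gene_pairs` shows why the cost grows quickly: n genes yield n(n-1)/2 pairs, each needing its own set of model fits.

```python
import math
import random

EPS = 1e-6  # small ridge added to variances so covariances stay invertible

def n_gene_pairs(n_genes):
    """Number of unordered gene pairs, n(n-1)/2."""
    return n_genes * (n_genes - 1) // 2

def pearson(points):
    """Pearson correlation of a list of (x, y) samples."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points)
    sxx = sum((p[0] - mx) ** 2 for p in points)
    syy = sum((p[1] - my) ** 2 for p in points)
    return sxy / math.sqrt(sxx * syy)

def density(p, mean, cov):
    """Bivariate Gaussian density; cov is the tuple (sxx, sxy, syy)."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    dx, dy = p[0] - mean[0], p[1] - mean[1]
    q = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
    return math.exp(-0.5 * q) / (2.0 * math.pi * math.sqrt(det))

def moments(points, w):
    """Weighted sample count, mean, and covariance (M-step for one component)."""
    tot = max(sum(w), 1e-12)
    mx = sum(wi * p[0] for wi, p in zip(w, points)) / tot
    my = sum(wi * p[1] for wi, p in zip(w, points)) / tot
    sxx = sum(wi * (p[0] - mx) ** 2 for wi, p in zip(w, points)) / tot + EPS
    sxy = sum(wi * (p[0] - mx) * (p[1] - my) for wi, p in zip(w, points)) / tot
    syy = sum(wi * (p[1] - my) ** 2 for wi, p in zip(w, points)) / tot + EPS
    return tot, (mx, my), (sxx, sxy, syy)

def e_step(points, weights, means, covs):
    """Responsibilities of each component for each sample, plus log-likelihood."""
    resp, loglik = [], 0.0
    for p in points:
        dens = [w * density(p, m, c) for w, m, c in zip(weights, means, covs)]
        total = sum(dens)
        loglik += math.log(total)
        resp.append([d / total for d in dens])
    return resp, loglik

def fit_gmm(points, k, iters=50):
    """Fit a k-component GMM with EM; return (hard labels, BIC)."""
    n = len(points)
    means = [points[0]]  # farthest-point initialization of the means
    while len(means) < k:
        means.append(max(points, key=lambda p: min(
            (p[0] - m[0]) ** 2 + (p[1] - m[1]) ** 2 for m in means)))
    _, _, pooled = moments(points, [1.0] * n)
    covs, weights = [pooled] * k, [1.0 / k] * k
    for _ in range(iters):
        resp, _ = e_step(points, weights, means, covs)
        stats = [moments(points, [r[j] for r in resp]) for j in range(k)]
        weights = [s[0] / n for s in stats]
        means = [s[1] for s in stats]
        covs = [s[2] for s in stats]
    resp, loglik = e_step(points, weights, means, covs)
    labels = [max(range(k), key=lambda j: r[j]) for r in resp]
    # A full-covariance 2-D GMM has 6k - 1 free parameters.
    bic = (6 * k - 1) * math.log(n) - 2.0 * loglik
    return labels, bic

def cluster_pair(points, max_k=2, min_size=3):
    """Pick the lowest-BIC model, then return one correlation per mode."""
    labels, _ = min((fit_gmm(points, k) for k in range(1, max_k + 1)),
                    key=lambda t: t[1])
    modes = [[p for p, l in zip(points, labels) if l == j]
             for j in range(max(labels) + 1)]
    return [pearson(m) for m in modes if len(m) >= min_size]

def make_two_mode_pair(n=40, seed=0):
    """Synthetic gene pair: one positively and one negatively correlated mode."""
    rng = random.Random(seed)
    up = [(x, x + rng.gauss(0, 0.3))
          for x in (rng.uniform(0, 3) for _ in range(n))]
    down = [(x, 13 - x + rng.gauss(0, 0.3))
            for x in (rng.uniform(10, 13) for _ in range(n))]
    return up + down
```

Calling `cluster_pair(make_two_mode_pair())` on the synthetic pair should return two correlations of opposite sign, whereas `pearson` on the pooled samples is much weaker because the two modes largely cancel. A GPU version would typically assign gene pairs to threads or thread blocks; that mapping is outside the scope of this sketch.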

Keywords