Genomics & Informatics (Mar 2020)

Bioinformatics services for analyzing massive genomic datasets

  • Gunhwan Ko,
  • Pan-Gyu Kim,
  • Youngbum Cho,
  • Seongmun Jeong,
  • Jae-Yoon Kim,
  • Kyoung Hyoun Kim,
  • Ho-Yeon Lee,
  • Jiyeon Han,
  • Namhee Yu,
  • Seokjin Ham,
  • Insoon Jang,
  • Byunghee Kang,
  • Sunguk Shin,
  • Lian Kim,
  • Seung-Won Lee,
  • Dougu Nam,
  • Jihyun F. Kim,
  • Namshin Kim,
  • Seon-Young Kim,
  • Sanghyuk Lee,
  • Tae-Young Roh,
  • Byungwook Lee

DOI
https://doi.org/10.5808/GI.2020.18.1.e8
Journal volume & issue
Vol. 18, no. 1

Abstract

The explosive growth of next-generation sequencing data has resulted in ultra-large-scale datasets and ensuing computational problems. In Korea, the amount of genomic data has been increasing rapidly in recent years. Leveraging these big data requires researchers to use large-scale computational resources and analysis pipelines. A promising solution to this computational challenge is cloud computing, where CPUs, memory, storage, and programs are accessible in the form of virtual machines. Here, we present a cloud computing-based system, Bio-Express, that provides user-friendly, cost-effective analysis of massive genomic datasets. Bio-Express is loaded with predefined multi-omics data analysis pipelines, which are divided into genome, transcriptome, epigenome, and metagenome pipelines. Users can employ the predefined pipelines or create new pipelines for analyzing their own omics data. We also developed several web-based services to facilitate downstream analysis of genome data. The Bio-Express web service is freely available at https://www.bioexpress.re.kr/.

Keywords