PeerJ (Jul 2019)

BioShake: a Haskell EDSL for bioinformatics workflows

  • Justin Bedő

DOI
https://doi.org/10.7717/peerj.7223
Journal volume & issue
Vol. 7
p. e7223

Abstract

Typical bioinformatics analyses consist of long-running computational workflows. An important part of reproducible research is the management and execution of these workflows to allow robust execution and to minimise errors. BioShake is an embedded domain-specific language in Haskell for specifying and executing computational workflows for bioinformatics that significantly reduces the possibility of errors occurring. Unlike other workflow frameworks, BioShake raises many properties to the type level, allowing the correctness of a workflow to be statically checked during compilation and catching errors before any lengthy execution process begins. BioShake builds on the Shake build tool to provide robust dependency tracking, parallel execution, reporting, and resumption capabilities. Finally, BioShake abstracts execution so that jobs can either be executed directly or submitted to a cluster. BioShake is available at http://github.com/PapenfussLab/bioshake.
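
The idea of raising workflow properties to the type level can be illustrated with a minimal Haskell sketch. It does not use BioShake's actual API: the names `Stage`, `FastQ`, `Bam`, `Vcf`, `input`, `align`, and `callVariants` are hypothetical and chosen only for illustration. Each stage carries its output format as a phantom type, so connecting incompatible stages is rejected by the compiler before anything runs.

```haskell
-- Hypothetical file-format tags (illustrative only, not BioShake's types).
data FastQ
data Bam
data Vcf

-- A workflow stage; the output format exists only in the type parameter.
newtype Stage fmt = Stage { commands :: [String] }

-- Each step accepts only a stage that produces the format it consumes.
input :: FilePath -> Stage FastQ
input path = Stage ["read " ++ path]

align :: Stage FastQ -> Stage Bam
align (Stage cs) = Stage (cs ++ ["align"])

callVariants :: Stage Bam -> Stage Vcf
callVariants (Stage cs) = Stage (cs ++ ["call"])

-- A well-typed pipeline: FastQ, then Bam, then Vcf.
pipeline :: Stage Vcf
pipeline = callVariants (align (input "sample.fastq"))

-- An ill-typed pipeline fails to compile, because callVariants
-- expects a Stage Bam rather than a Stage FastQ:
--   broken = callVariants (input "sample.fastq")

main :: IO ()
main = mapM_ putStrLn (commands pipeline)
```

In this sketch the format mismatch surfaces as a type error at compile time, which is the kind of early failure the abstract describes, rather than as a failure partway through a long-running job.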

Keywords