BMC Research Notes (Jul 2017)
fastQ_brew: module for analysis, preprocessing, and reformatting of FASTQ sequence data
Abstract
Background Next-generation sequencing datasets are stored as FASTQ-formatted files. To avoid downstream artefacts, it is critical to apply a robust preprocessing protocol to the FASTQ sequences that establishes the integrity and quality of the data. Results Here I describe fastQ_brew, a package that provides a suite of methods to evaluate sequence data in FASTQ format and that efficiently implements a variety of manipulations to filter sequence data by size, quality and/or sequence. fastQ_brew supports mismatch-tolerant searches for adapter sequences, left- and right-end trimming, and removal of duplicate reads and of reads containing non-designated bases. fastQ_brew also returns summary statistics on the unfiltered and filtered FASTQ data, and offers FASTQ to FASTA conversion as well as reverse-complement and DNA to RNA conversion of FASTQ sequences. Conclusions fastQ_brew is open source and freely available to all users at the following webpage: https://github.com/dohalloran/fastQ_brew .
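To make the kinds of operations listed above concrete, the following is a minimal Python sketch of filtering reads by length, mean quality, non-designated bases and duplication, together with reverse-complement, DNA to RNA and FASTQ to FASTA conversion. It is an illustration only and does not use or reproduce fastQ_brew's own interface: every function name, parameter and threshold here is hypothetical, and the sketch assumes four-line FASTQ records with Phred+33 quality encoding.

```python
from typing import Iterable, Iterator, List, NamedTuple


class Read(NamedTuple):
    name: str
    seq: str
    qual: str


def parse_fastq(handle: Iterable[str]) -> Iterator[Read]:
    """Yield reads from four-line FASTQ records (assumes no wrapped lines)."""
    lines = [line.rstrip("\n") for line in handle if line.strip()]
    for i in range(0, len(lines) - 3, 4):
        header, seq, _plus, qual = lines[i:i + 4]
        yield Read(header[1:], seq, qual)


def mean_quality(qual: str) -> float:
    """Mean Phred score, assuming Phred+33 (Sanger/Illumina 1.8+) encoding."""
    return sum(ord(c) - 33 for c in qual) / len(qual)


def filter_reads(reads: Iterable[Read], min_len: int, min_mean_q: float) -> List[Read]:
    """Drop short, low-quality, N-containing and duplicate reads."""
    seen = set()
    kept = []
    for read in reads:
        if len(read.seq) < min_len:
            continue  # too short
        if mean_quality(read.qual) < min_mean_q:
            continue  # low mean quality
        if set(read.seq.upper()) - set("ACGT"):
            continue  # contains a non-designated base such as N
        if read.seq in seen:
            continue  # exact duplicate sequence
        seen.add(read.seq)
        kept.append(read)
    return kept


_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Complement each base, then reverse the sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def dna_to_rna(seq: str) -> str:
    """Replace thymine with uracil."""
    return seq.replace("T", "U").replace("t", "u")


def to_fasta(reads: Iterable[Read]) -> str:
    """Render reads as FASTA, discarding the quality strings."""
    return "".join(f">{r.name}\n{r.seq}\n" for r in reads)


# Hypothetical sample data: r2 duplicates r1, r3 contains N,
# r4 is too short, r5 has a mean quality of 0.
SAMPLE = """@r1
ACGTACGT
+
IIIIIIII
@r2
ACGTACGT
+
IIIIIIII
@r3
ACGNACGT
+
IIIIIIII
@r4
ACG
+
III
@r5
TTGACCAA
+
!!!!!!!!
"""

if __name__ == "__main__":
    kept = filter_reads(parse_fastq(SAMPLE.splitlines()), min_len=5, min_mean_q=20)
    print(to_fasta(kept), end="")
    print(reverse_complement("AACG"), dna_to_rna("ACGT"))
```

With the sample data, only read r1 passes all four filters. A real preprocessing tool would also need to handle adapter matching with mismatches and end trimming, which this sketch leaves out.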
Keywords